Thursday, May 02, 2024

SQL Essential Training - LinkedIn

  • Datum - a single piece of information.
  • Data is the plural of datum. Data are pieces of information - text, images, or video.
  • Database - a collection of data. It can be organized in many ways, most commonly as tables. 
  • Tables - rows and columns, like an Excel spreadsheet.
  • Each column specifies an attribute of the data.

An RDBMS (Relational Database Management System) is a program used to create, update, and manage relational databases. In a relational database, data is organized into tables, with each table containing rows (also known as records or tuples) and columns (attributes). Here are some key points about RDBMS:

  1. Table Structure:
    • An RDBMS structures information in tables, rows, and columns.
    • Each table represents a specific type of data (e.g., Customers, Orders).
    • Columns define the attributes (e.g., Customer ID, Order Date).
    • Rows contain actual data entries (e.g., individual customer records).
  2. Relationships:
    • RDBMS allows establishing relationships between tables using common attributes.
    • Instead of hierarchical structures, data is stored in related tables.
    • Primary keys uniquely identify rows, and foreign keys link related data.
  3. Example:
    • Consider a Customer table and an Order table:
      • Customer Table:
        • Customer ID (primary key)
        • Customer name
        • Billing address
        • Shipping address
      • Order Table:
        • Order ID (primary key)
        • Customer ID (foreign key)
        • Order date
        • Shipping date
        • Order status
    • By linking the Customer ID in both tables, we establish a relationship (see the sketch after this list).
  4. Well-Known RDBMSs:
    • Some popular RDBMSs include MySQL, PostgreSQL, MariaDB, Microsoft SQL Server, and Oracle Database.
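
A minimal sketch of the Customer and Order example from point 3 above - the column names come from the list, while the datatypes and lengths are assumptions, and the Order table is named CustomerOrder here because ORDER is a reserved word:

Create table Customer (
CustomerId Int PRIMARY KEY NOT NULL,
CustomerName varchar (60),
BillingAddress varchar (120),
ShippingAddress varchar (120)
);

Create table CustomerOrder (
OrderId Int PRIMARY KEY NOT NULL,
CustomerId Int Foreign Key REFERENCES Customer(CustomerId),   -- foreign key back to Customer
OrderDate Date,
ShippingDate Date,
OrderStatus varchar (20)
);

-- The shared CustomerId links each order back to its customer:
Select c.CustomerName, o.OrderId, o.OrderDate
from Customer as c
INNER JOIN CustomerOrder as o
ON o.CustomerId = c.CustomerId;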

Here are some of the most popular Relational Database Management Systems (RDBMS):

  1. Oracle: As of September 2023, Oracle is the most popular RDBMS in the world, with a ranking score of 1240.88. It also holds the top position overall among all DBMSs.

  2. MySQL: MySQL is widely used and known for its open-source nature. It’s a popular choice for web applications and small to medium-sized databases.

  3. Microsoft SQL Server: Developed by Microsoft, SQL Server is commonly used in enterprise environments. It offers robust features, scalability, and integration with other Microsoft products.

  4. PostgreSQL: PostgreSQL is an open-source RDBMS known for its extensibility, ACID compliance, and support for advanced data types. It’s popular among developers and data professionals.

  5. IBM DB2: IBM DB2 is an enterprise-grade RDBMS with features like high availability, security, and scalability. It’s commonly used in large organizations.

  6. Microsoft Access: While not as powerful as the others, Microsoft Access is widely used for small-scale databases and desktop applications.

  7. SQLite: SQLite is a lightweight, embedded RDBMS often used in mobile apps and small projects.



WSDA Music


Company management wants to know what we can learn from the data: is there any useful information about sales or customer demographics, and are there any ways the company can improve or expand sales?

SQLite

SQLite is not a lighter version of SQL itself; rather, it is a lightweight relational database management system (RDBMS) that adheres to SQL specifications. Let’s explore why it’s named as such:


  1. Lightweight and Embedded:

    • SQLite focuses on providing a powerful SQL-compatible database without overheads or dependencies.
    • As the name implies, it’s a lightweight solution that can run on almost anything that supports C and persistent file storage.
    • Unlike traditional database systems that require a separate server process, SQLite is serverless and integrates directly into the application it serves.
  2. Key Features:

    • Embeddable: SQLite is embedded within the application, eliminating the need for a separate database server.
    • SQL Compatibility: Despite its lightweight nature, SQLite supports a vast majority of SQL standard features, making it robust enough for various applications.
    • File-Based: It operates directly on files, making it easy to manage and distribute.
  3. Use Cases:

    • Mobile Devices: SQLite is commonly used in mobile devices (such as Android and iOS) due to its small footprint and efficient storage.
    • Embedded Systems: It’s also popular in embedded systems, IoT devices, and desktop applications.
    • Testing and Prototyping: Developers often use SQLite for testing, prototyping, and small-scale projects.



Functions of each tab in DB Browser for SQLite:

  1. Database Structure:

    • In this tab, you can:
      • Create new database tables.
      • List existing database tables.
      • Delete database tables.
      • Define the structure of your database by specifying table names, columns, and their data types.
  2. Browse Data:

    • Here, you can:
      • View the actual data stored in your tables.
      • Browse through rows and columns.
      • Add new rows or modify existing data.
      • Essentially, it allows you to interact with the data in your database.
  3. Edit Pragmas:

    • The Edit Pragmas tab deals with system-wide parameters (pragmas) related to SQLite.
    • Pragmas are special commands that control various aspects of SQLite behavior.
    • You won’t typically need to change these settings unless you have specific requirements.
  4. Execute SQL:

    • This tab allows you to:
      • Write and execute SQL queries directly.
      • Query your database for specific information.
      • Inspect query results.
      • Perform operations like SELECT, INSERT, UPDATE, and DELETE.
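
For example, a simple query like the one below (assuming the WSDA Music sample database with its Invoice table is open) can be typed and run directly in the Execute SQL tab:

Select InvoiceId, BillingCity, Total
from Invoice
where Total > 5
order by Total DESC
LIMIT 10;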

Aliases



Operator Types



  • SELECT * FROM Invoice WHERE Total IN (1.98, 3.96);
  • SELECT * FROM Invoice WHERE BillingCity IN ('Brussels', 'Orlando', 'Paris');
  • SELECT * FROM Invoice WHERE BillingCity LIKE 'b%';
  • SELECT * FROM Invoice WHERE Total > 1.98 AND (BillingCity LIKE 'p%' OR BillingCity LIKE 'd%');
  • CASE statement
SELECT *,
CASE
    WHEN Total < 2 THEN 'Baseline Purchase'
    WHEN Total BETWEEN 2 AND 6.99 THEN 'Low Purchase'
    WHEN Total BETWEEN 7 AND 15 THEN 'Target Purchase'
    ELSE 'Top Performer'
END AS PurchaseType
FROM Invoice;

 


Filtering for only the top performers by adding a WHERE clause, as in the sketch below.
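
One way to do it (a sketch): the PurchaseType alias cannot be referenced in the WHERE clause, so the filter repeats the threshold implied by the ELSE branch:

SELECT *,
CASE
    WHEN Total < 2 THEN 'Baseline Purchase'
    WHEN Total BETWEEN 2 AND 6.99 THEN 'Low Purchase'
    WHEN Total BETWEEN 7 AND 15 THEN 'Target Purchase'
    ELSE 'Top Performer'
END AS PurchaseType
FROM Invoice
WHERE Total > 15;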


JOINS

Getting data from 2 or more tables in a single SQL statement

Goal: a full list of customers (first name and last name) alongside all the invoices generated for each customer.

Entity-Relationship diagram for Invoice and Customer relationship.  


SELECT * FROM Invoice
INNER JOIN Customer
ON Invoice.CustomerId = Customer.CustomerId
ORDER BY Customer.CustomerId;

Aliasing

SELECT c.CustomerId, c.LastName, c.FirstName, i.InvoiceId, i.InvoiceDate
FROM Invoice AS i
INNER JOIN Customer AS c
ON i.CustomerId = c.CustomerId
ORDER BY c.CustomerId;


Discrepancies between tables are handled with different join types


  • An invoice referencing CustomerId 6 has no matching row in the Customer table.
  • Customers with CustomerId 1 and 5 do not have entries in the Invoice table. 
Inner Join

  • An inner join returns only matching records.
  • Any unmatched data from either table is ignored.

An inner join will therefore ignore the invoice for customer 6 as well as customers 1 and 5. One of the other join types keeps those unmatched rows, as sketched below.
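
A hedged sketch of one such join type (not part of the original notes): a LEFT JOIN keeps every row from the left-hand table even when there is no match on the right, so customers 1 and 5 still show up:

SELECT c.CustomerId, c.FirstName, c.LastName, i.InvoiceId
FROM Customer AS c
LEFT JOIN Invoice AS i
ON i.CustomerId = c.CustomerId
ORDER BY c.CustomerId;

-- Customers 1 and 5 appear with NULL in the InvoiceId column.
-- Swapping the table order (Invoice LEFT JOIN Customer) would instead keep the invoice
-- that points at the missing customer 6.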






Sunday, April 28, 2024

RAG - Retrieval Augmented Generation AI

Courtesy: Databricks.com

Retrieval augmented generation, or RAG, is an architectural approach that can improve the efficacy of large language model (LLM) applications by leveraging custom data. This is done by retrieving data/documents relevant to a question or task and providing them as context for the LLM. RAG has shown success in support chatbots and Q&A systems that need to maintain up-to-date information or access domain-specific knowledge.

RAG is the right place to start, being easy and possibly entirely sufficient for some use cases. Fine-tuning is most appropriate in a different situation, when one wants the LLM's behavior to change, or to learn a different "language." These are not mutually exclusive. As a future step, it's possible to consider fine-tuning a model to better understand domain language and the desired output form — and also use RAG to improve the quality and relevance of the response.


When I want to customize my LLM with data, what are all the options and which method is the best (prompt engineering vs. RAG vs. fine-tune vs. pretrain)?

There are four architectural patterns to consider when customizing an LLM application with your organization's data. These techniques are outlined below and are not mutually exclusive. Rather, they can (and should) be combined to take advantage of the strengths of each.

Friday, April 26, 2024

LLM / SLM Parameters

What do we understand by LLM or SLM Parameters?

**Parameters** in deep learning, including language models, are adjustable values that control the behavior of neural networks. These parameters are learned during training and determine how the model processes input data.

In LLMs and SLMs, parameters typically include:

1. **Weight matrices**: These matrices contain the numerical values that are multiplied by input vectors to produce output activations.

2. **Bias terms**: These are additive constants added to the weighted sum of inputs to adjust the activation function's output.

3. **Learned embeddings**: These are fixed-size vector representations of words, phrases, or tokens learned during training.

The number and complexity of these parameters directly impact the model's performance, accuracy, and computational requirements. More parameters often allow for more nuanced learning and better representation of complex linguistic patterns, but also increase the risk of overfitting and computational costs.

In the context of LLMs, having **billions** of parameters means that the model has an enormous number of adjustable values, allowing it to capture subtle relationships between words, contexts, and meanings. This complexity enables LLMs to achieve impressive results in tasks like language translation, question answering, and text generation.

Conversely, SLMs have far fewer parameters (typically in the millions to a few billion, versus hundreds of billions for the largest LLMs), which makes them more efficient but also less capable of capturing complex linguistic patterns.

SQL: A Practical Introduction for Querying Databases (IBM)

This is a refresher course. I already have a diploma in Oracle RDBMS.

Structured Query Language

  • A language for relational databases
  • Used to query data
Data

A collection of facts (words, numbers) and pictures. 
  • Data needs to be secured and accessed when required.
  • The above can be achieved using a database.
Database
  • Is a repository of data / it is a program that stores data.
  • Provides functionality for adding, modifying and querying data.
Different kinds of databases
  • Relational database
    • Data is stored in tabular form - columns & rows
    • Like a spreadsheet
    • Columns hold the properties of each item - such as last name, first name, email address, etc.
    • A table is a collection of related things - for example employees, salaries, etc.
    • In a relational database, you can form relationships between tables. 
    • Emp table, Salary table, etc. etc.
    • RDBMS
  • DBMS
    • Database Management System - the set of software tools for managing the data in a database is called a DBMS.
    • The database itself is the repository of data.  
    • The terms database, database server, database system, data server, and DBMS are all used interchangeably.
RDBMS
  • MySQL, DB2, Oracle
SQL
  • Create table
  • Insert data to a table
  • Select statement to see the data in a table
  • Update data in a table
  • Delete data from the table

Types of SQL Statements

- Data Definition Language
- Data Manipulation Language
- Data Control Language
- Transaction Control Language
Image courtesy: Geeksforgeeks.com

The results coming back from an SQL query are called a result set, and are presented as a table.

Display the table structure of an existing table (sp_help is SQL Server-specific):

sp_help BOOK;



DML - Read and Modify data

Select Statement
  • Select col1, col2, col3 from FilmLocations;
  • Select * from FilmLocations;
  • Select count(*) from FilmLocations;
  • Select DISTINCT(Directors) from FilmLocations;
  • Select * from FilmLocations LIMIT 25;
  • SELECT DISTINCT Title FROM FilmLocations WHERE ReleaseYear=2015 LIMIT 3 OFFSET 5; Retrieves the next 3 distinct film names after skipping the first 5 films released in 2015.
INSERT Statement
  • INSERT INTO TableName (ColumnName1, ColumnName2, ..., ColumnNameN) VALUES (<Value1>, <Value2>, ..., <ValueN>);
  • Inserting 1 row at a time:
    • Insert into AUTHOR (Author_ID, Lastname, Firstname, Email, City, Country) values ('A1', 'Chong', 'Raul', 'rfc@ibm.com', 'Toronto', 'CA');
  • Multiple rows can be inserted in a single statement:
    • Insert into Author (Author_ID, Lastname, Firstname, Email, City, Country) 
                    values
                    ('A1', 'Chong', 'Raul', 'rfc@ibm.com', 'Toronto', 'CA'),
                    ('A2', 'Ahuja', 'Rav', 'ra@ibm.com', 'Toronto', 'CA');
  • Insert into Instructor(ins_id, lastname, firstname, city, country) VALUES (8, 'Ryan', 'Steve', 'Barlby', 'GB'), (9, 'Sannareddy', 'Ramesh', 'Hyderabad', 'IN');   

Entity name - Author
Entity Attributes - Author_ID, Country, City, Email, FirstName, LastName.

Update statement
  • Alter data in a table using UPDATE statement.
  • Update TableName SET ColumnName1=Value1 WHERE [condition];
  • Example, Update AUTHOR SET Lastname='KATTA', Firstname='Lakshmi' WHERE AUTHOR_Id='A2';
  • If where clause is not specified all rows will be updated.

Delete statement
  • Read and modify data.
  • Delete from TableName WHERE <condition>;
  • Delete from AUTHOR WHERE Author_ID in ('A1', 'A2'); both rows will be deleted.
  • If where clause is not specified all rows will be deleted.


Relational Databases

1. RELATIONAL MODEL



2. E-R MODEL (Entity - Relationship Model)

Book - entity (drawn as a rectangle in E-R diagrams)
Attribute - Title, Description, etc. (drawn as ovals in E-R diagrams)


  • Entity Book becomes a table in a database. 
  • Attributes become the columns.


PK uniquely identifies each Tuple / Row in a table. 



Foreign keys are primary keys defined in other tables; they create links between tables. In the above example, Author ID is a primary key in the Author table, but in the Author List table it is a foreign key linking the Author List table to the Author table.

Common Datatypes include Characters (VARCHAR), Numbers, and Date/Times.

DDL VS DML

  • DDL - define, change and drop data. 
    • CREATE
    • ALTER
    • TRUNCATE
    • DROP
  • DML - read and modify data in tables. 
    • Also known as CRUD operations.
    • INSERT
    • SELECT
    • UPDATE
    • DELETE
CREATE



Create table AUTHOR (

AUTHOR_ID  Int PRIMARY KEY NOT NULL,
Lastname varchar (30) NOT NULL,
firstname varchar (30) NOT NULL,
email varchar (30),
city varchar (30),
country varchar (30)

)

select * from AUTHOR

Create table BOOK (

BOOK_ID  Int PRIMARY KEY NOT NULL,
Title varchar (30),
Edition Int,
BYear Int,
Price Decimal (5,2),
ISBN varchar (6),
Pages Int,
Aisle Int,
Description varchar (80)
)

Select * from BOOK;


Create table BORROWER (

BORROWER_ID  Int PRIMARY KEY NOT NULL,
Lastname varchar (30) NOT NULL,
firstname varchar (30) NOT NULL,
email varchar (30),
Phone Int,
Address varchar (80),
City varchar (30),
Country varchar (30),
BRDescription varchar (80)

)

select * from BORROWER;


Create table AUTHOR_LIST (
AUTHOR_ID Int Foreign Key REFERENCES AUTHOR(AUTHOR_ID),
BOOK_ID Int Foreign Key REFERENCES BOOK(BOOK_ID),
AuthRole varchar (30)

)

Create table COPY (
COPY_ID Int PRIMARY KEY NOT NULL,
BOOK_ID Int Foreign Key REFERENCES BOOK(BOOK_ID),
STATUS varchar (6)
)

ALTER, DROP, TRUNCATE Tables

  1. Alter
    1. Add / Remove columns
    2. Modify datatype of a col
    3. Add / remove keys
    4. Add / remove constraints
            ALTER TABLE <table_name>
            ADD <column1> <datatype>,
                <column2> <datatype>;

Add cols

Alter table AUTHOR 
Add Qualification varchar(30);

Delete cols           

Alter table BOOK
drop column Qual;

Modify datatype of a col

ALTER TABLE AUTHOR
ALTER COLUMN Qualification VARCHAR(4); //Instead of Varchar(30) or you can modify datatype to another (string to int, etc.). 

Constraints

Create table myKeyConstraints (

sno int PRIMARY KEY NOT NULL,

firstname varchar(30),

lastname varchar(30)

);

sp_help mykeyconstraints; // Check the constraint name highlighted below....


Drop constraint

alter table myKeyConstraints
drop constraint PK__myKeyCon__DDDF64469E1B9A37;

// Constraint is gone...


PRIMARY KEY = UNIQUE + NOT NULL
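
A small sketch to illustrate (the table name is made up for this example): declaring a column UNIQUE and NOT NULL enforces the same two rules a primary key gives you, although a table can have only one primary key.

Create table DemoKeys (
id int UNIQUE NOT NULL,       -- the same uniqueness and not-null rules a PRIMARY KEY enforces
name varchar(30)
);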








LLM Quality vs Size - Small Language Models

 Courtesy: Microsoft.com 




Thursday, April 25, 2024

Hadoop Ecosystem

 Courtesy: data-flair.training, bizety

  • Hadoop provides a distributed storage and processing framework.
    • Hadoop is a framework for distributed storage and processing of large datasets across clusters of computers. 
    • It includes two main components: 
      • Hadoop Distributed File System (HDFS) for storage and 
      • MapReduce for processing. 
    • Hadoop is designed to store and process massive amounts of data in a fault-tolerant and scalable manner.
  • Hive is a data warehouse infrastructure built on top of Hadoop. It provides a SQL-like interface for querying and analyzing data stored in Hadoop. Hive is suitable for data warehousing and analytics use cases.
  • PySpark enables data processing using the Spark framework with Python.
  • YARN manages cluster resources to support various processing frameworks.
  • HBase provides a scalable NoSQL database solution for real-time access to large datasets.

Tuesday, April 23, 2024

AI and LLMs - why GPUs, what happened to CPUs?

This has been written with the help of Copilot.

Central Processing Units (CPUs)

  • CPUs are versatile and handle a wide range of tasks: executing instructions, general-purpose computing, and coordinating the various components within the computer system (memory, I/O, peripheral devices, etc.).
  • They are the “brains” of a computer, executing instructions from software programs.
  • CPUs excel at sequential processing, where tasks are executed one after the other. They also support parallel processing to some extent through techniques like multi-core architectures and pipelining.
  • They manage tasks like operating system functions, application execution, and handling I/O operations.
  • CPUs are essential for general-purpose computing, including running applications, managing memory, and coordinating system resources.

Graphics Processing Units (GPUs)

GPUs were initially associated primarily with gaming, along with applications like scientific simulations and data processing. They have since transcended that original purpose and evolved well beyond gaming. Let's explore how this transformation occurred and why GPUs are now indispensable for a wide range of computing tasks.

  • GPUs are specialized hardware components designed for parallel processing.
  • Their architecture consists of thousands of cores, each capable of handling computations simultaneously.
  • Originally developed for graphics rendering (such as gaming), GPUs evolved to handle complex mathematical operations efficiently.
  • GPUs excel at tasks like matrix operations, image processing, and parallel algorithms.
  • In recent years, GPUs have become crucial for AI, machine learning, scientific simulations, and data-intensive workloads.

  1. Evolution of GPUs:

  2. Beyond Gaming: Diverse Applications:

    • Artificial Intelligence (AI) and Machine Learning:
      • GPUs play a pivotal role in training neural networks for AI and machine learning.
      • Their parallel architecture accelerates tasks like natural language processing and computer vision.
    • Data Science and Analytics:
      • GPUs handle massive datasets efficiently, reducing computation times for tasks like data preprocessing and statistical analysis.
    • High-Performance Computing (HPC):
      • Scientific research, weather forecasting, and simulations rely heavily on GPUs.
      • They excel in solving complex mathematical models with remarkable accuracy.
    • Medical Imaging and Research:
  3. The Trajectory of GPUs:


LLMs - words vs tokens

https://kelvin.legal/understanding-large-language-models-words-versus-tokens/#:~:text=The%20size%20of%20text%20an,the%20cost%20and%20vice%20versa.

Tokens can be thought of as pieces of words. Before the model processes the prompts, the input is broken down into tokens. These tokens are not cut up exactly where the words start or end - tokens can include trailing spaces and even sub-words. -- Llama.

The size of text an LLM can process and generate is measured in tokens. Additionally, the operational expense of LLMs is directly proportional to the number of tokens it processes - the fewer the tokens, the lower the cost and vice versa. 






Tokenizing language translates it into numbers – the format that computers can actually process. Using tokens instead of words enables LLMs to handle larger amounts of data and more complex language. By breaking words into smaller parts (tokens), LLMs can better handle new or unusual words by understanding their building blocks.


Agile in the age of AI - Henrik Kniberg

 https://hups-com.cdn.ampproject.org/c/s/hups.com/blog/agile-in-the-age-of-ai?hs_amp=true

Agile methodologies like Scrum are being impacted by the rise of AI. The traditional assumptions about team dynamics, roles, and development cycles are being challenged.

  • Cross-Functional Teams: AI's vast knowledge and productivity acceleration are reshaping the need for cross-functional teams. Smaller teams and more teams with AI assistance may become the norm.
  • Superteam: Smaller teams may combine into a kind of superteam, holding stand-ups to sync up, coordinate, and address dependencies and issues. The purpose and structure of these meetings will change from what they are today.
  • Changing Developer Roles: With AI's capability to generate code, developers may shift to decision-making and oversight roles, with AI handling much of the coding work.
  • Redefining Sprints: Agile sprints may become shorter or disappear as AI speeds up development cycles, making traditional timeboxing less relevant.
  • Specialists in Agile Teams: Specialists may become roaming or shared resources, complementing AI capabilities within smaller teams.
  • Evolution of Scrum Master Role: Scrum Masters may transition to coaches, guiding teams in effectively utilizing AI technologies.
  • User Feedback Loop: AI-driven mock users could supplement real user feedback, allowing for more frequent and immediate input in Agile development.
  • Additional Considerations: Various factors like 
    • Product backlog prioritization: the product backlog will need to be updated frequently. The PO will focus more on strategic prioritization and stakeholder management. 
    • Estimation methods: teams will need new ways of planning and forecasting. 
    • Framework adaptations: popular Agile frameworks like Scrum, Kanban, or SAFe might need to be adapted to accommodate the changes brought by AI. 
    • Team dynamics: teams will require new ways to ensure human connection, creativity, and innovation in an AI-driven environment.
    • Continuous learning will become even more crucial as AI takes on an ever larger share of the work. Team members may need to focus on developing new skills, such as prompt engineering, AI model selection, and result evaluation.
    • Ethical considerations need to be addressed in the AI-driven Agile landscape - biases, fairness and transparency. 
The Age of AI calls for a recalibration of Agile practices, with a focus on adapting to the new realities brought about by AI technologies.

Wednesday, April 17, 2024

Secure by Design

 Secure by Design (SBD) in the IT industry refers to an approach where security is integrated into the design phase of software, systems, or products rather than being added as an afterthought. The goal is to proactively identify and mitigate security risks throughout the development lifecycle rather than trying to patch vulnerabilities later.

The Secure by Design (SBD) Engineer works closely with the Project Manager (PM) from the project's outset. Together, they examine the architecture with a focus on security. If any vulnerabilities or risks are identified, the SBD Engineer provides recommendations to address them.

As the project progresses, and typically during the mid-stage of System Integration Testing (SIT) when major defects are resolved, the SBD Engineer requests a Fortify scan of the codebase before deployment. If the scan reveals no issues, the process continues smoothly. However, if vulnerabilities are found, the team undertakes code refactoring to address them. After refactoring, the SBD Engineer ensures that any changes do not affect the system's functionality through SIT regression testing.

This meticulous approach ensures that security is integrated into every phase of the project, ultimately resulting in a more resilient and secure IT system.

Structured Query Language (SQL)

Monday, April 15, 2024

NLP and LLM

Natural Language Processing and Large Language Models.

Natural Language Processing

NLP stands for Natural Language Processing. Imagine you're teaching a computer to understand and interact with human language, just like how ChatGPT and I are communicating right now. NLP involves developing algorithms and techniques to enable computers to understand, interpret, and generate human language in a way that's meaningful to us. It's what powers virtual assistants like Siri or Alexa, language translation services like Google Translate, and even spell checkers or autocomplete features in your smartphone keyboard.

NLP is the intersection of computer science, artificial intelligence, and linguistics. 

For a computer to be able to process language, the following steps are involved:

Understanding Language Structure: At its core, NLP aims to teach computers how to understand the structure, meaning, and context of human language. This involves breaking down language into its fundamental components such as words, phrases, sentences, and paragraphs.

Tokenization: One of the initial steps in NLP is tokenization, where text is divided into smaller units called tokens. These tokens could be words, subwords, or characters, depending on the specific task and language being processed.

Syntax Analysis: NLP algorithms analyze the syntactic structure of sentences to understand the grammatical rules and relationships between words. Techniques like parsing help identify the subject, verb, object, and other parts of speech in a sentence.

Semantic Analysis: Beyond syntax, NLP also focuses on understanding the meaning of words and sentences. This involves techniques such as semantic parsing, word sense disambiguation, and semantic role labeling to extract the underlying semantics from text.

Named Entity Recognition (NER): NER is a crucial task in NLP where algorithms identify and classify entities such as names of people, organizations, locations, dates, and numerical expressions within text.

Sentiment Analysis: This branch of NLP involves determining the sentiment or emotion expressed in a piece of text. Sentiment analysis techniques range from simple polarity classification (positive, negative, neutral) to more nuanced approaches that detect emotions like joy, anger, sadness, etc.

Machine Translation: NLP plays a key role in machine translation systems like Google Translate, which translate text from one language to another. These systems employ techniques such as statistical machine translation or more modern neural machine translation models.

Question Answering Systems: NLP powers question answering systems like chatbots and virtual assistants. These systems understand user queries and generate appropriate responses by analyzing the semantics and context of the questions.

Text Generation: Another exciting area of NLP is text generation, where algorithms produce human-like text based on input prompts or contexts. Large language models, such as GPT (like the one you're talking to!), are capable of generating coherent and contextually relevant text across various domains.

NLP Success

NLP has seen remarkable success over the past few decades, with continuous advancements driven by research breakthroughs and technological innovations. Here are some key areas where NLP has made significant strides:

  1. Machine Translation: NLP has revolutionized the field of translation, making it possible for people to communicate seamlessly across language barriers. Systems like Google Translate employ sophisticated NLP techniques to provide reasonably accurate translations for a wide range of languages.
  2. Virtual Assistants and Chatbots: Virtual assistants such as Siri, Alexa, and Google Assistant have become integral parts of our daily lives, thanks to NLP. These systems understand and respond to spoken or typed queries, perform tasks like setting reminders and sending messages, and even provide personalized recommendations.
  3. Information Retrieval and Search Engines: NLP powers search engines like Google to understand user queries and return relevant search results. Techniques like natural language understanding help search engines interpret the user's intent and deliver more accurate results.
  4. Sentiment Analysis: NLP enables businesses to analyze large volumes of text data, such as customer reviews and social media posts, to gauge public sentiment towards products, services, or brands. Sentiment analysis tools help companies make informed decisions and improve customer satisfaction.
  5. Text Summarization and Extraction: NLP techniques are used to automatically summarize long documents or extract key information from unstructured text data. This is particularly useful in fields like news aggregation, document summarization, and information retrieval.
  6. Healthcare Applications: In healthcare, NLP is used for clinical documentation, medical record analysis, and extracting valuable insights from patient data. NLP-powered tools assist healthcare professionals in diagnosis, treatment planning, and medical research.
  7. Language Generation [LLM which is a subset of NLP]: Recent advancements in large language models (LLMs) have enabled machines to generate human-like text with impressive coherence and fluency. These models can write articles, generate code, compose music, and even engage in creative writing tasks.
  8. Accessibility Tools: NLP has contributed to the development of accessibility tools for individuals with disabilities, such as text-to-speech and speech-to-text systems, which enable people with visual or auditory impairments to interact with digital content more effectively.

Large Language Models (LLM)

        While NLP has been successful in many tasks, LLMs like GPT (Generative Pre-Trained Transformer) have addressed several limitations and brought about significant advancements in the field. Below are the reasons why LLMs were developed despite the success of NLP.
        1. Contextual Understanding: 
           Traditional NLP approaches often struggled with understanding context across longer pieces of text or in ambiguous situations. LLMs, on the other hand, leverage deep learning techniques to capture contextual dependencies effectively, enabling them to generate more coherent and contextually relevant text.    
        2. Scalability: 
        3. Transfer Learning: 
        4. Language Generation: 
        5. Data Efficiency: 
        6. Continual Learning: 


