Thursday, June 27, 2024

DSPM, Data Security Posture Management, Data Observability

DATA SECURITY POSTURE MANAGEMENT

DSPM, or Data Security Posture Management, is a practice that involves assessing and managing the security status of data across an organization's IT environment. This concept is particularly relevant in the context of modern data management, where data is often distributed across multiple systems, platforms, and locations. DSPM aims to provide a comprehensive view of how data is handled, protected, and accessed, helping organizations to secure sensitive information and comply with data protection regulations.

"It involves the implementation of policies, practices, and technologies to ensure the confidentiality, integrity, and availability of sensitive information. The goal of DSPM is to establish and maintain a robust security posture that can effectively identify, protect, detect, respond to, and recover from potential security threats and incidents related to data." - Pulkit Duggal on LinkedIn.

Key components of DSPM:
  1. Data Discovery
    1. Objective: Identify where sensitive data resides within the organization.
    2. Techniques: Use tools to scan databases, cloud storage, and other repositories to map data assets.
  2. Data Classification
    1. Objective: Categorize data based on its sensitivity and importance.
    2. Techniques: Apply tags or labels to data to denote its classification (e.g., public, confidential, sensitive).
  3. Risk Assessment
    1. Objective: Evaluate the potential risks associated with the data.
    2. Techniques: Analyze vulnerabilities, access patterns, and potential threats to data security.
  4. Policy Management
    1. Objective: Define and enforce data security policies.
    2. Techniques: Implement rules and controls to govern data access, sharing, and storage.
  5. Access Controls & Encryption
    1. Objective: Ensure that only authorized users have access to sensitive data.
    2. Techniques: Use role-based access controls (RBAC), identity management, least privilege principles, and encryption of data at rest and in transit.
  6. Monitoring & Alerting
    1. Objective: Continuously monitor data access and usage to detect anomalies.
    2. Techniques: Implement tools to track data activity and generate alerts for suspicious behavior.
  7. Incident Response
    1. Objective: Respond to data security incidents effectively.
    2. Techniques: Develop and implement response plans for data breaches or unauthorized access.
  8. Compliance Management / Auditing
    1. Objective: Ensure adherence to legal and regulatory requirements related to data security.
    2. Techniques: Map data security practices to frameworks like GDPR, CCPA, HIPAA, etc.
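
To make the first two components concrete, here is a minimal sketch of data discovery and classification. It assumes records are plain text keyed by a location string; the patterns, labels, and record names are hypothetical, and real DSPM tools use far more robust detectors.

```python
import re

# Hypothetical detection patterns, for illustration only.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify_record(text):
    """Return (label, findings) for one text record."""
    findings = sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))
    if "us_ssn" in findings:
        label = "sensitive"
    elif findings:
        label = "confidential"
    else:
        label = "public"
    return label, findings


def scan(records):
    """Discovery plus classification: map each record location to its label."""
    return {location: classify_record(text) for location, text in records.items()}


# Made-up records standing in for databases or cloud storage objects.
records = {
    "crm/notes/1": "Contact jane@example.com about renewal",
    "hr/forms/7": "SSN 123-45-6789 on file",
    "web/blog/3": "Our office is closed on Friday",
}
inventory = scan(records)
for location, (label, findings) in inventory.items():
    print(location, label, findings)
```

The resulting inventory is the kind of input that later components (risk assessment, policy management) would work from.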
Benefits of DSPM
  • Enhanced Data Security: By continuously assessing and managing the security posture of data, organizations can better protect sensitive information.
  • Regulatory Compliance: Helps organizations comply with various data protection laws and regulations.
  • Risk Reduction: Identifies and mitigates risks associated with data breaches and unauthorized access.
  • Operational Efficiency: Streamlines data security processes and reduces the complexity of managing data across diverse environments.
DATA OBSERVABILITY

Data observability refers to the comprehensive monitoring and analysis of data pipelines, systems, and processes to ensure data quality, reliability, and operational efficiency. It involves observing the health and behavior of data as it moves through various stages of processing and storage, allowing organizations to detect and address issues proactively.

Key Components of Data Observability

Data Quality Monitoring:

Objective: Ensure data accuracy, consistency, completeness, and timeliness.
Techniques: Use automated checks and validation rules to monitor for anomalies, missing values, or incorrect data.
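
As a rough illustration of automated checks and validation rules, the sketch below flags missing values and out-of-range amounts. The field names, the valid range, and the sample rows are assumptions made up for this example.

```python
def check_quality(rows, required_fields, valid_range):
    """Return a list of (row_index, problem) tuples for the given rows."""
    problems = []
    low, high = valid_range
    for i, row in enumerate(rows):
        # Completeness check: every required field must be present and not None.
        for field in required_fields:
            if row.get(field) is None:
                problems.append((i, f"missing {field}"))
        # Validity check: the (hypothetical) amount field must fall in range.
        amount = row.get("amount")
        if amount is not None and not (low <= amount <= high):
            problems.append((i, f"amount out of range: {amount}"))
    return problems


# Made-up rows for illustration.
rows = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": None},
    {"order_id": None, "amount": -5.0},
]
issues = check_quality(rows, ["order_id", "amount"], (0.0, 10_000.0))
for index, problem in issues:
    print(f"row {index}: {problem}")
```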

Pipeline Health Monitoring:

Objective: Track the performance and reliability of data pipelines.
Techniques: Monitor metrics such as data latency, throughput, and failure rates to identify and resolve bottlenecks or errors.
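
A minimal sketch of pipeline health monitoring, assuming each pipeline run is recorded with a success flag and a latency in seconds. The thresholds and run records are hypothetical.

```python
def pipeline_health(runs, max_failure_rate=0.1, max_avg_latency_s=60.0):
    """Summarize failure rate and average latency, and list threshold breaches."""
    failures = sum(1 for run in runs if not run["ok"])
    failure_rate = failures / len(runs)
    avg_latency_s = sum(run["latency_s"] for run in runs) / len(runs)
    alerts = []
    if failure_rate > max_failure_rate:
        alerts.append("failure rate too high")
    if avg_latency_s > max_avg_latency_s:
        alerts.append("latency too high")
    return {"failure_rate": failure_rate, "avg_latency_s": avg_latency_s, "alerts": alerts}


# Made-up run history for illustration.
runs = [
    {"ok": True, "latency_s": 30.0},
    {"ok": True, "latency_s": 40.0},
    {"ok": False, "latency_s": 50.0},
    {"ok": True, "latency_s": 40.0},
]
health = pipeline_health(runs)
print(health)
```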

End-to-End Visibility:

Objective: Provide a holistic view of data as it flows through the system.
Techniques: Implement tracing and logging to follow data lineage from source to destination.

Anomaly Detection:

Objective: Identify unusual patterns or behaviors in data that may indicate issues.
Techniques: Use statistical models, machine learning algorithms, and thresholds to detect outliers and anomalies.
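
One simple statistical approach is a z-score threshold: flag any value that sits too many standard deviations from the mean. The sketch below assumes a daily row-count metric with made-up numbers; the threshold of 2.0 is an arbitrary choice for this small sample.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # All values identical, nothing stands out.
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical daily row counts; the last day looks like a partial load.
daily_row_counts = [1000, 1020, 990, 1010, 1005, 995, 1015, 120]
anomalies = find_anomalies(daily_row_counts)
print(anomalies)
```

In practice a single large outlier also inflates the standard deviation, so more robust statistics (such as the median) or learned models are often preferred.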

Alerting and Notifications:

Objective: Provide real-time alerts for any detected issues or deviations.
Techniques: Configure alerts based on specific conditions or thresholds, and integrate with communication tools for immediate response.

Root Cause Analysis:

Objective: Diagnose the underlying causes of data issues.
Techniques: Use diagnostic tools and logs to trace problems back to their source, whether in data collection, processing, or storage.

Data Lineage and Dependency Tracking:

Objective: Understand how data moves through different systems and how changes impact downstream processes.
Techniques: Maintain a map of data dependencies and transformations to track lineage and assess the impact of modifications.
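
A lineage map can be treated as a directed graph, and impact analysis as a traversal of it. The sketch below assumes a hand-written dependency map with hypothetical dataset names and lists everything downstream of a changed dataset.

```python
from collections import deque

# Hypothetical lineage map: each dataset points to the datasets derived from it.
LINEAGE = {
    "raw_orders": ["clean_orders"],
    "clean_orders": ["daily_revenue", "customer_ltv"],
    "daily_revenue": ["exec_dashboard"],
    "customer_ltv": [],
    "exec_dashboard": [],
}


def downstream(lineage, source):
    """Breadth-first walk returning every dataset affected by a change to source."""
    seen = set()
    order = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


impacted = downstream(LINEAGE, "clean_orders")
print(impacted)
```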

User and Application Behavior Monitoring:

Objective: Observe how users and applications interact with data.
Techniques: Analyze access patterns, query performance, and usage metrics to optimize performance and security.


Wednesday, June 26, 2024

What do Transformer models like ChatGPT really do and really lack?

What do Transformer models do

  • Pattern Recognition: Transformer models, like GPT-4, are excellent at recognizing and generating patterns in data. They use vast amounts of text to learn how words and concepts are typically related to one another.
  • Contextual Understanding: They excel at understanding context and generating coherent responses based on the input they receive. They use attention mechanisms to weigh the importance of different parts of the input text.
  • Predictive Capabilities: They predict the next word or phrase based on previous text, allowing them to generate text that is coherent and contextually relevant.
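
To illustrate how attention weighs different parts of the input, here is a minimal sketch of scaled dot-product attention for a single query, written in plain Python. The tiny two-dimensional vectors are made up; real models learn these vectors and use many attention heads over much larger dimensions.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Weigh each value by how well its key matches the query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output


# Made-up vectors standing in for three input tokens.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attend([2.0, 0.0], keys, values)
print(weights, output)
```

Tokens whose keys align with the query receive larger weights, so they contribute more to the output.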

Limitations in Understanding

  1. No True Comprehension: Transformers don’t "understand" content in the way humans do. Their responses are based on statistical correlations rather than a deep comprehension of meaning or intent.

  2. Lack of World Knowledge: While they can mimic knowledge and understanding, they don’t possess a personal model of the world or experiences. Their knowledge is derived from the data they’ve been trained on and not from actual lived experience.

  3. Surface-Level Reasoning: Their reasoning is often surface-level and dependent on patterns seen in the training data. They can sometimes generate plausible but incorrect or nonsensical answers, particularly in complex or ambiguous situations.

  4. No Self-Awareness: Transformers lack self-awareness and consciousness. They don’t have personal beliefs, desires, or subjective experiences. They process information but don't experience it.

Practical Implications

  1. Useful for Many Tasks: Despite their limitations, transformer models are highly effective for a wide range of tasks such as language translation, text summarization, and conversational agents.

  2. Dependence on Data Quality: Their performance and the quality of their outputs heavily depend on the quality and scope of the data they have been trained on.

  3. Ethical Considerations: Their lack of true understanding raises important ethical considerations, particularly in terms of trust and the potential for misuse or misinterpretation.

Sunday, June 02, 2024

Google AI Essentials - Coursera

Google AI Essentials

CONTENTS
  1. Introduction to AI
  2. Maximize productivity with AI Tools
  3. Discover the art of Prompt Engineering
  4. Use AI Responsibly
  5. Stay Ahead of the AI curve. 

Introduction to AI

Intelligence: the human ability to perform cognitive tasks.

Cognitive Task: is any mental activity such as thinking, understanding, learning, and remembering. Cognitive Abilities help humans in making decisions and solving problems. However, there is a limit to how much processing we can do at a time; AI helps extend our cognitive abilities. AI helps us to make better decisions and to solve problems faster.

Artificial Intelligence: computer programs that can complete cognitive tasks typically associated with human intelligence. 

AI assists us with tasks by using math to learn from data.

AI Development Techniques

There are mainly two techniques used to design AI programs:

  1. Rules-based techniques: involve creating AI programs that strictly follow predefined rules to make decisions. For example, a spam filter using rule-based techniques might block emails that contain specific keywords using its predefined logic.
  2. Machine-learning techniques: involve creating AI programs that can analyze and learn from patterns in data to make independent decisions. For example, a spam filter using these techniques might flag potential spam for the recipient to review, preventing automatic blocking. If the recipient marks emails from trusted sources as safe, the spam filter learns and adapts its logic to include similar emails from that sender in the future.
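
The contrast between the two spam filters can be sketched as follows. The keyword list and sender address are hypothetical, and the second filter is a deliberately tiny stand-in for "learning from feedback" (it only remembers trusted senders), not a real machine-learning algorithm.

```python
BLOCKED_KEYWORDS = {"lottery", "prize"}  # Hypothetical predefined rules.


def rule_based_is_spam(message):
    """Rules-based: block any message containing a predefined keyword."""
    words = set(message.lower().split())
    return bool(words & BLOCKED_KEYWORDS)


class FeedbackFilter:
    """Flags messages for review and adapts when the recipient marks a sender safe."""

    def __init__(self):
        self.trusted_senders = set()

    def is_flagged(self, sender, message):
        if sender in self.trusted_senders:
            return False
        return rule_based_is_spam(message)

    def mark_safe(self, sender):
        # Feedback from the recipient changes future decisions.
        self.trusted_senders.add(sender)


spam_filter = FeedbackFilter()
sender = "raffle@school.example"
message = "Pick up your raffle prize on Friday"
flagged_before = spam_filter.is_flagged(sender, message)
spam_filter.mark_safe(sender)
flagged_after = spam_filter.is_flagged(sender, message)
print(flagged_before, flagged_after)
```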

How does AI use machine learning?

  • Recommendation systems (such as YouTube's recommended videos) use AI.
  • An AI tool is AI-powered software that can automate or assist users with a variety of tasks.
  • Examples of AI systems: real-time translation and map software that suggests quick routes.
  • AI systems are powered by machine learning (ML).
Machine Learning is a subset of AI which is focused on developing computer programs that can analyze data to make predictions / decisions.

Training Set


AI designers build ML programs using a training set, which is a collection of data used to teach an AI about something.

Example of a Training Set

A company uses AI to identify Ripe Apples from a set of ripe and unripe apples. For this, the program is trained on a large set of images of both unripe and ripe apples. Eventually, the program learns to identify ripe apples from unripe apples using the training data. 



Quality and Relevance of Training Data
  • ML programs are only as good as the quality and relevance of their training data.
  • A fundamental problem can be bias within the training data, which can cause an AI tool to produce inaccurate, biased, or unintended outputs.


Approaches to training ML Programs [ML is a subset of AI]

  • Supervised learning: 
In this approach, the ML program learns from a labeled training set. A labeled training set includes data that is labeled or tagged, which provides context and meaning to the data. For instance, an email spam filter that's trained with supervised learning would use a training set of emails that are labeled as “spam” or “not spam.” Supervised learning is often used when there's a specific output in mind.
  • Unsupervised learning: 
In this approach, the ML program learns from an unlabeled training set. An unlabeled training set includes data that does not have labels or tags. For instance, ML might be used to analyze a dataset of unsorted  email messages and find patterns in topics, keywords, or contacts. In other words, unsupervised learning is used to identify patterns in data without a specific output in mind.
  • Reinforcement learning: 
In this approach, the ML program uses trial-and-error to learn which actions lead to the best outcome. The program learns to do this by getting rewarded for making good choices that lead to the desired results. Reinforcement learning is commonly used by conversational AI tools. As these tools receive feedback from users and AI designers, they learn to generate effective responses.

Each ML technique has its own strengths and weaknesses. Depending on the type of data that's available and what's needed to solve the particular problem, AI designers may use one, two, or all three of these techniques to produce an AI-powered solution.
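
As an illustration of supervised learning, the sketch below trains a small naive Bayes classifier on a labeled training set of emails, matching the "spam" / "not spam" example above. The four training emails are made up, and naive Bayes is just one convenient choice of algorithm for this demonstration.

```python
import math
from collections import Counter


def train(labeled_emails):
    """Count words per label from a labeled training set."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    label_counts = Counter()
    for text, label in labeled_emails:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(model, text):
    """Pick the label with the highest log-probability for the given text."""
    word_counts, label_counts = model
    vocab = set(word_counts["spam"]) | set(word_counts["not spam"])
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out a label.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up labeled training set.
training_set = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]
model = train(training_set)
print(predict(model, "free prize money"))
print(predict(model, "agenda for the team meeting"))
```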

Foundations of Generative AI

Generative AI is AI that can generate new content such as text, images, voice, and video. You can interact with these tools using natural language (the language that people use to communicate with each other).







Capabilities and Limitations of AI


1. Healthcare

Disease Diagnosis and Treatment: AI can analyze medical data to assist in diagnosing diseases and recommending treatments. Machine learning algorithms can detect patterns in medical imaging, genetic data, and patient records to identify conditions early.

Personalized Medicine: AI can help tailor treatments to individual patients based on their genetic makeup, lifestyle, and other factors, improving efficacy and reducing side effects.

Drug Discovery: AI can accelerate the discovery and development of new drugs by analyzing vast datasets to identify potential compounds and predict their effectiveness and safety.

2. Education

Personalized Learning: AI can adapt educational content to meet the needs of individual students, providing customized lessons, feedback, and assessments.

Automated Grading: AI can help teachers by automating the grading process, allowing them to focus more on student engagement and personalized instruction.

Tutoring and Support: AI-powered chatbots and virtual tutors can provide students with additional support and resources outside of the classroom.

3. Environmental Conservation

Climate Modeling: AI can enhance climate models, improving predictions about climate change and helping policymakers make informed decisions.

Wildlife Protection: AI can monitor wildlife populations using drone footage and sensor data, helping to combat poaching and manage conservation efforts.

Energy Efficiency: AI can optimize energy use in buildings, transportation, and industrial processes, reducing carbon footprints and promoting sustainability.

4. Social Good

Disaster Response: AI can analyze data from social media, sensors, and other sources to provide real-time information during natural disasters, aiding in response and recovery efforts.

Humanitarian Aid: AI can help distribute resources more effectively during crises by predicting needs and optimizing logistics.

Accessibility: AI can develop tools for people with disabilities, such as speech-to-text applications, navigation aids, and personalized assistive technologies.

5. Economic Development

Financial Inclusion: AI can provide financial services to underserved populations, enabling them to access credit, insurance, and other financial products.

Job Creation: While AI may displace certain jobs, it can also create new opportunities in tech development, maintenance, and various support roles.

Agriculture: AI can optimize farming practices by analyzing data on weather, soil, and crops, leading to increased yields and sustainable practices.

6. Governance and Public Services

Smart Cities: AI can enhance urban planning and management, improving traffic flow, reducing pollution, and enhancing public safety.

Fraud Detection: AI can detect and prevent fraudulent activities in public services, ensuring that resources are used effectively and reach those in need.

Policy Making: AI can analyze large datasets to inform evidence-based policy making, helping governments address complex issues more effectively.

7. Research and Innovation

Scientific Discovery: AI can process and analyze large volumes of research data, identifying new patterns and accelerating scientific breakthroughs.

Interdisciplinary Collaboration: AI can facilitate collaboration across different fields by identifying commonalities and enabling the sharing of knowledge and resources.

Ethical and Responsible AI

Bias Mitigation: Developing algorithms that are transparent and unbiased is crucial. Efforts to identify and mitigate biases in AI systems will ensure fair and equitable outcomes.

Privacy Protection: Ensuring that AI systems are designed with privacy in mind, using techniques such as differential privacy and federated learning, will protect individual data.

Regulation and Standards: Establishing clear regulations and standards for AI development and deployment will help ensure that AI technologies are used responsibly and ethically.

AI as a collaborative Tool

- AI automation (for example, front-desk email automation)
- Press releases: AI can assist a PR person in creating press releases for the media.

1. Project Management and Coordination
  • Task Management: AI can automate task assignment based on team members' skills and availability, ensuring optimal resource allocation.
  • Timeline Prediction: Predict project timelines and identify potential delays by analyzing past project data.
  • Progress Tracking: Monitor project progress in real-time, providing updates and alerts for potential bottlenecks.
  • Meeting Notes: Automate note-taking during meetings.
2. Communication Enhancement
  • Language Translation: Real-time translation services for multilingual teams to facilitate clear communication.
  • Speech-to-Text: Convert meetings and discussions into text for easy reference and sharing.
  • Chatbots: Provide instant answers to common questions, facilitating smooth internal communication.
3. Data Analysis and Decision Making
  • Data Aggregation: Collect and consolidate data from various sources for comprehensive analysis.
  • Trend Analysis: Identify trends and patterns in data that can inform strategic decisions.
  • Predictive Analytics: Forecast future outcomes and suggest proactive measures based on historical data.
4. Content Creation and Collaboration
  • Document Generation: Automate the creation of reports, proposals, and other documents.
  • Real-time Editing: AI-powered tools like Grammarly or Copysmith for real-time editing and content improvement.
  • Version Control: Manage document versions and ensure all collaborators are working on the latest version.
5. Learning and Development
  • Personalized Learning: Tailor training programs to individual needs and learning styles using AI.
  • Knowledge Sharing: AI-driven knowledge bases that provide relevant information and resources to team members.
  • Performance Feedback: Continuous and automated feedback on employee performance and areas for improvement.
6. Customer Support and Engagement
  • Virtual Assistants: AI chatbots to handle customer inquiries, providing quick and accurate responses.
  • Sentiment Analysis: Analyze customer feedback to understand sentiments and improve service quality.
  • Automated Follow-ups: Schedule and send follow-up emails or messages to customers or team members.
7. Innovation and Idea Management
  • Brainstorming Assistance: AI tools to generate ideas or suggestions based on input criteria.
  • Trend Forecasting: Predict market trends and suggest innovative solutions.
  • Collaboration Platforms: AI-enhanced platforms like Slack or Microsoft Teams to facilitate idea sharing and collaboration.
8. Operations and Workflow Automation
  • Process Automation: Automate repetitive tasks and workflows to save time and reduce errors.
  • Resource Allocation: Optimize resource use by predicting needs and reallocating as necessary.
  • Supply Chain Management: AI-driven insights for better inventory management and logistics coordination.
9. Sales and Marketing
  • Lead Scoring: Identify and prioritize high-potential leads using AI analysis.
  • Customer Insights: Gain deeper insights into customer behavior and preferences.
  • Campaign Optimization: Automate and optimize marketing campaigns for better reach and engagement.
10. Human Resources
  • Recruitment: Use AI for resume screening, candidate matching, and interview scheduling.
  • Employee Engagement: Monitor and improve employee engagement through sentiment analysis and feedback loops.
  • Performance Management: Analyze employee performance data to provide personalized development plans.
11. Research and Development
  • Literature Review: Automate the review of scientific literature and patents to keep up with the latest developments.
  • Experimentation: Design and analyze experiments using AI to predict outcomes and optimize processes.
  • Collaborative Platforms: AI-powered platforms for sharing research data and findings among collaborators.
12. Health and Safety
  • Health Monitoring: AI tools to monitor employee health and well-being.
  • Safety Compliance: Ensure compliance with safety regulations through automated checks and alerts.
  • Incident Prediction: Predict potential safety incidents and suggest preventive measures.

Maximize Productivity using AI Tools

  • Brainstorm ideas.
  • Text generation for websites.
  • Support research on topics.
  • Translate content into multiple languages
  • Generate images - logo, branding, design visuals, etc.
  • Create audio and video for promotional purposes.

Ways to use AI

  • Use as standalone AIs
    • Speeko: public speaking AI tutor
  • Integrated AI - Adobe Photoshop Neural Filters: AI integrated into an existing desktop solution.
  • AI Tool chain
  • Or a custom solution

AI Models and the Training Process

AI Tool: is an AI-powered software that can automate or assist users with a variety of tasks. 
AI Model: is a computer program that is trained on sets of data to recognize patterns and perform specific tasks.


The car: An AI tool, like a car, gets you to a “destination,” such as a completed task or an output. And AI designers and engineers, just like auto engineers, add various features and controls into AI tools to provide a user-friendly experience.

The engine: An AI model is the underlying component that makes the “car” run. It's under the hood, you might say, processing user input and allowing you to drive the car.

Note: AI tools sometimes use multiple AI models in order to have more flexibility and  perform a wider range of tasks.

The process of training AI models

AI designers and engineers develop AI models through a process called training. The steps below show this process for a model that predicts rainfall.
  1. Define the problem to be solved. AI designers and engineers want to predict rain to help people stay dry when commuting to and from work. They start by considering AI’s capabilities and limitations before identifying an AI solution.
  2. Collect relevant data to train the model. AI designers and engineers gather historical data of days when it rained and days when it didn't rain over the past 50 years.
  3. Prepare the data for training. AI designers and engineers prepare the data by labeling important features, such as outdoor temperature, humidity, and air pressure, and then noting whether it rained. It's also common to separate the data into two distinct sets: a training set and a validati
  4. Train the model. AI designers and engineers apply machine learning (ML) programs to their rain prediction model, which helps it recognize patterns in its training data that indicate the likelihood of rainfall. Those patterns might include high temperatures, low air pressure, 
  5. Evaluate the model. AI designers and engineers use the validation set they prepared earlier to assess their model's ability to predict rainfall accurately and reliably. Analyzing a model's performance can uncover potential issues impacting the model, such as insufficient or bias
  6. Deploy the model. When the AI designers and engineers are satisfied with their model's performance, they deploy it in an AI tool—helping people in their city stay dry on their way to work!
Note:  AI designers and engineers should continuously monitor and collect feedback on their models, ensuring their models continue to perform reliably and to identify areas for improvement.
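
The steps above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the weather readings are invented, the features are limited to humidity and air pressure, and the "model" is a simple nearest-centroid classifier rather than whatever a real team would choose.

```python
# Steps 2 and 3: made-up historical readings (humidity %, pressure hPa, rained?).
history = [
    (90, 1000, True), (85, 1002, True), (40, 1025, False), (35, 1028, False),
    (80, 1005, True), (45, 1020, False), (88, 998, True), (50, 1022, False),
    (92, 1001, True), (38, 1026, False), (78, 1006, True), (55, 1018, False),
]
training_set, validation_set = history[:8], history[8:]


def centroid(rows):
    """Average humidity and pressure over a group of readings."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def train(rows):
    """Step 4: learn one typical reading for rainy days and one for dry days."""
    rainy = [r for r in rows if r[2]]
    dry = [r for r in rows if not r[2]]
    return {"rain": centroid(rainy), "dry": centroid(dry)}


def predict_rain(model, humidity, pressure):
    """Predict rain if the reading is closer to the rainy centroid."""
    def dist2(c):
        return (humidity - c[0]) ** 2 + (pressure - c[1]) ** 2
    return dist2(model["rain"]) < dist2(model["dry"])


def evaluate(model, rows):
    """Step 5: fraction of held-out readings predicted correctly."""
    correct = sum(predict_rain(model, h, p) == rained for h, p, rained in rows)
    return correct / len(rows)


model = train(training_set)
accuracy = evaluate(model, validation_set)
print(f"validation accuracy: {accuracy:.2f}")
```

Keeping the validation rows out of training is what makes the accuracy figure a fair estimate of how the model behaves on days it has not seen.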

Example tools include:

  • Anthropic Claude

    • Description: Anthropic Claude can complete problem-solving tasks, like finding mathematical solutions, translating between languages, and summarizing long documents. 

    • Stand-alone or integrated: Stand-alone

  • Gemini

    • Description: Supercharge your creativity and productivity with Gemini. Chat to start writing, planning, learning and more with Google AI. 

    • Stand-alone or integrated: Both

  • Microsoft Copilot

    • Description: Integrated with Microsoft Edge, Microsoft Copilot can help with online searches to find information, compare products, and summarize web page content.

    • Stand-alone or integrated: Both

  • ChatGPT

    • Description: ChatGPT can generate ideas, plan schedules, debug code, and proofread text.

    • Stand-alone or integrated: Stand-alone

Productivity and writing assistants


AI productivity and writing assistants can help with workplace tasks. They might provide grammar or spelling suggestions, generate a summary of a long document, or solve problems. Here are some examples: 

  • Clockwise

    • Description: Clockwise is a calendar tool that learns users’ work habits to automatically schedule and manage calendar events.

    • Example industries: Consulting, technology, sales

    • Stand-alone or integrated: Stand-alone

  • Grammarly

    • Description: Grammarly is a writing assistant that can help users edit and write clear, concise text.

    • Example industries: Creative writing, education, marketing

    • Stand-alone or integrated: Stand-alone

  • Jasper

    • Description: Jasper is a writing assistant intended for marketing tasks, like drafting social media posts, emails, and landing page content.

    • Example industries: Copywriting, marketing, sales

    • Stand-alone or integrated: Stand-alone

  • NotebookLM

    • Description: NotebookLM integrates into document apps, like Google Docs, and helps summarize or ask specific questions about text, notes, and sources.

    • Example industries: Content writing, finance, sales

    • Stand-alone or integrated: Both

  • Notion AI

    • Description: Notion AI is a writing assistant built into Notion, a productivity and note-taking software tool.

    • Example industries: Development, marketing, product management, sales

    • Stand-alone or integrated: Integrated

  • AI by Zapier

    • Description: AI by Zapier is a built-in productivity tool that allows AI automation to be integrated with the apps and workflows already connected through Zapier.

    • Example industries: Engineering, marketing, project management, technology

    • Stand-alone or integrated: Integrated

Code-generative AI tools


Code-generating tools can help generate, edit, or complete code for a variety of programming tasks in many different programming languages. Examples include:

  • Android Studio Bot

    • Description: Built into Android Studio, Studio Bot can generate code and answer questions about Android development.

    • Example industries: Data science, software development, web development

    • Stand-alone or integrated: Integrated

  • GitHub Copilot

    • Description: Built into GitHub, Copilot can write and suggest code, suggest descriptions for pull requests, translate multiple languages into code, and index repositories.

    • Example industries: Data science, software development, web development

    • Stand-alone or integrated: Both

  • Replit AI

    • Description: Built into Replit, a cloud-based Integrated Development Environment (IDE) for programmers, this tool can make suggestions, help explain code, and turn natural language into code.

    • Example industries: Data science, software development, web development

    • Stand-alone or integrated: Integrated

  • Tabnine

    • Description: Tabnine is available as a plugin for many popular code editors and helps speed up delivery and keep code safe.

    • Example industries: Data science, software development, web development

    • Stand-alone or integrated: Stand-alone

  • Jupyter AI

    • Description: Jupyter is an open-source platform for coding, and this built-in tool includes a chat interface, which can be used to generate code, fix coding errors, and ask questions about files.

    • Example industries: Data science, software development, web development

    • Stand-alone or integrated: Integrated

Image and media generative AI tools


Media-generating AI tools help workers with tasks like generating and editing images, video, and speech. Examples include:


  • Adobe Firefly

    • Description: Built into the Adobe suite, Firefly can generate and edit images.

    • Example industries: Design, education, marketing

    • Stand-alone or integrated: Integrated

  • Canva Magic Design™ 

    • Description: Canva Magic Design is a tool that generates text and image content in Canva, an online graphic design tool.

    • Example industries: Design, education, marketing

    • Stand-alone or integrated: Integrated

  • DALL-E

    • Description: Integrated with ChatGPT, DALL-E generates images from text prompts.

    • Example industries: Design, education, marketing

    • Stand-alone or integrated: Integrated

  • ElevenLabs

    • Description: ElevenLabs is a speech AI tool that can generate spoken voice-over audio from text in different languages.

    • Example industries: Content creation, education, marketing, production

    • Stand-alone or integrated: Stand-alone

  • Google Ads

    • Description: Google Ads helps businesses reach customers around the world, driving growth and performance. Google Ads makes it easy to create campaigns, measure impact and improve your results. Put Google AI to work for your business with the Google Ads AI Essentials. Learn more with the AI Explored video series.

    • Example industries: Marketing, Advertising

    • Stand-alone or integrated: Integrated

  • Midjourney

    • Description: Integrated into Discord, Midjourney can generate images from text prompts.

    • Example industries: Design, education, marketing

    • Stand-alone or integrated: Integrated

  • Runway

    • Description: Runway can generate a new video from a text prompt or edit an existing video’s style or focus area, and remove people or other elements.

    • Example industries: Content creation, design, marketing, production

    • Stand-alone or integrated: Stand-alone

Importance of Human involvement in AI 


- Remove bias
- Remove hallucinations / wrong output. (Hallucinations can be misleading; however, in certain cases, such as generating images, they can still be beneficial.)




Determining if Generative AI is right for the task on hand


An example where generative AI is not a good fit: suppose you want to negotiate with local suppliers to get the best price for the ingredients you use in your restaurant. This task is not generative. It requires communication and relationship building, which an AI tool cannot do for you.


When is Generative AI not right?

  • Tasks requiring specialized knowledge - For instance, a restaurateur wants to use an AI tool to draft a lease agreement for their new storefront. While the tool can generate text, it might lack the legal expertise to handle specific clauses and regulations, potentially leading to inaccurate or incomplete portions of the lease.
  • Tasks requiring knowledge of personal preferences - Consider a training manager who's using an AI tool to create a customized lesson plan for a new onboarding workshop that caters to the needs of new employees. Without an understanding of each employee's roles or learning styles, the AI tool won't be able to produce an effective lesson plan.
  • Tasks requiring information beyond the last training date - Tasks that require information beyond the AI's last training date cannot be reliably addressed by generative AI tools. This is due to knowledge cutoff, the concept that an AI model is trained at a specific point in time, so it doesn’t have any knowledge of events or information after that date. For example, a business owner who's preparing their 2025 tax documents might ask a generative AI tool to summarize the newest tax deductions available in 2024. However, if the AI tool was last trained in 2023, it won't produce a useful output because it lacks the information needed to complete this task.

Prompt Engineering and LLM

A prompt is text input that provides instructions to the AI model on how to generate output. 


  • LLMs are trained on massive amounts of data comprising books, articles, websites, and more. This training helps models learn the patterns and relationships that exist in human language.
  • They can also predict which word is likely to come next in a sentence.

LLMs compute a probability for each word that could come next. In an example where "wet" is the most likely next word, "damp" has a lower probability and "dry" a much lower one.

LLMs use statistics to analyze the relationships between all the words in a given sequence and compute the probabilities for thousands of possible next words. This predictive power enables LLMs to respond to questions and requests.

An LLM may vary its response to the same prompt each time you ask.
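
To make the next-word idea concrete, here is a minimal sketch in Python. The scores for "wet", "damp", and "dry" are made-up numbers for illustration, not output from a real model, and the helper names `softmax` and `sample_next_word` are hypothetical. It shows how raw scores can be turned into probabilities, and why sampling from those probabilities can give a different answer each time.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())
    exps = {word: math.exp(s - highest) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

def sample_next_word(probs, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

# Hypothetical scores for three candidate next words (assumed values).
scores = {"wet": 4.0, "damp": 2.5, "dry": 0.5}
probs = softmax(scores)

# Print the candidates from most to least likely.
for word, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{word}: {p:.3f}")

# Sampling is random, so repeated runs may print different words.
print("sampled:", sample_next_word(probs, random.Random()))
```

Because the word is sampled rather than always taking the top choice, "wet" comes up most often but "damp" or "dry" can still appear now and then. Real LLMs work over a vocabulary of many thousands of tokens, so treat this only as a toy picture.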

Key principles of effective prompts / Prompt Engineering

  • Use a verb to say exactly what to do, for example "Create an outline of an article..." or "Edit the language for a non-technical audience."
  • Provide context.
  • Iterate for best results.

You might use an LLM at work to help boost your productivity and creativity.



Zero-shot prompting - no examples are provided
One-shot prompting - one example is provided
Few-shot prompting - two or more examples are provided
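
The difference between these prompting styles is just how many worked examples go into the prompt. A minimal sketch, assuming a simple "Input / Output" text layout; the functions `build_prompt` and `shot_type` and the sentiment examples are hypothetical, and no particular AI tool or API is assumed:

```python
def build_prompt(instruction, task, examples=()):
    """Build a prompt with zero, one, or several worked examples."""
    lines = [instruction, ""]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    # The real task goes last, with the output left for the model to fill in.
    lines.append(f"Input: {task}")
    lines.append("Output:")
    return "\n".join(lines)

def shot_type(examples):
    """Name the prompting style based on the number of examples."""
    count = len(examples)
    if count == 0:
        return "zero-shot"
    if count == 1:
        return "one-shot"
    return "few-shot"

# Made-up examples for a review-labelling task.
examples = [
    ("The soup was cold and bland.", "negative"),
    ("Friendly staff and great pasta!", "positive"),
]
instruction = "Label the restaurant review as positive or negative."
task = "The dessert was the best I have ever had."

for subset in (examples[:0], examples[:1], examples):
    print(f"--- {shot_type(subset)} ---")
    print(build_prompt(instruction, task, subset))
    print()
```

The resulting text is what you would paste or send to the AI tool. Adding examples usually helps the model match the format and tone you want.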

Recognize Data Bias and AI Harms

  • A model is only as good as the data it receives.
  • Systemic bias exists in social systems.
  • The data in these systems may already be biased, since humans are influenced by systemic biases.




  • The more the data represents a wide variety of people, the more inclusive the output of image generation will be.
  • AI tools are value-laden, not intrinsically value-neutral, so applying them requires critical thinking.
    • AI models reflect biases of the data they were trained on.
    • AI models also reflect the values of the people who designed them.

AI Harms

    • Example: If a property manager for an apartment complex were to use an AI tool that conducted background checks to screen applications for potential tenants, the AI tool might misidentify an applicant and deem them a risk because of a low credit score. They might be denied an apartment and lose the application fee.


If AI doesn't provide information to all, some people may be denied opportunities.


    • Example: When speech-recognition technology was first developed, the training data didn’t have many examples of speech patterns exhibited by people with disabilities, so the devices often struggled to parse this type of speech.

 
- People with disabilities

    • Example: When translation technology was first developed, certain outputs would inaccurately skew masculine or feminine. For example, when generating a translation for words like “nurse” and “beautiful,” the translation would skew feminine. When words like “doctor” and “strong” were used as inputs, the translation would skew masculine.

- Feminine / masculine speech

- Example: deepfakes

    • Example: If someone were able to take control over an in-home device at their previous apartment to play an unwanted prank on their former roommate, these actions could result in a loss of sense of self and agency by the person affected by the prank.






Drift

Drift is the decline in an AI model's accuracy in predictions due to changes over time that aren’t reflected in the training data. For instance, a fashion designer might want to track trends in spending before creating a new collection. To begin tracking, they use a model built in 2015 that was trained on fashion trends and consumer habits from 2015. However, the model is no longer accurate because societal habits and fashion trends change over time. Consumer preferences in 2015 are different from today’s trends. In other words, the model’s predictions have drifted from accurate at the time of training to less accurate in the present day.
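
One simple way to notice drift is to compare a model's accuracy at training time with its accuracy on recent data. A minimal sketch with made-up predictions and outcomes; the 0.10 tolerance and the function names `accuracy` and `has_drifted` are assumptions for illustration, not a standard:

```python
def accuracy(predictions, actuals):
    """Fraction of predictions that match what actually happened."""
    correct = sum(1 for p, a in zip(predictions, actuals) if p == a)
    return correct / len(actuals)

def has_drifted(baseline_accuracy, recent_accuracy, tolerance=0.10):
    """Flag drift when accuracy drops by more than the tolerance."""
    return (baseline_accuracy - recent_accuracy) > tolerance

# Made-up data: will spending on a fashion item go "up" or "down"?
predictions = ["up", "up", "down", "up", "down", "down", "up", "up", "down", "up"]
actuals_at_training = ["up", "up", "down", "up", "down", "down", "up", "up", "down", "down"]
actuals_today = ["down", "up", "up", "up", "up", "down", "down", "up", "up", "down"]

baseline = accuracy(predictions, actuals_at_training)
recent = accuracy(predictions, actuals_today)

print(f"accuracy at training time: {baseline:.2f}")
print(f"accuracy today: {recent:.2f}")
print("drift detected:", has_drifted(baseline, recent))
```

In this toy data the same predictions that fit the training period well fit today's outcomes poorly, which is the pattern drift describes. In practice the fix is usually to retrain the model on newer data.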

Knowledge Cutoff

A knowledge cutoff is the concept that a model is trained at a specific point in time, so it doesn’t have any knowledge of events or information after that date. For example, if you ask a generative AI tool that was trained in 2022 how much the latest smartphone costs, the model’s output won’t include today’s newest technology—you’ll only learn about smartphones that existed in 2022. So, if a model’s data isn’t up to date, the output won’t be either.

Checklist for using AI responsibly

  1. Identify potential harms by prompting the AI tool with different examples and checking the outputs.

  2. Test the tool on topics you’re familiar with, so you can verify outputs with your own knowledge.

  3. To minimize the effects of hallucinations:

    1. Always make sure your prompt provides context, includes an example, and states a request.

    2. Avoid using a false premise in your input. Make sure your prompt is clear, specific, and accurate.

  4. Tell your audience and anyone it might affect that you’ve used or are using AI. This step is particularly important when using AI in high-impact professional settings, where there are risks involved in the outcome of AI. Explain what type of tool you used, describe your intention, provide an overview of your use of AI, and offer any other information that could help your audience evaluate potential risks. Don’t copy and paste outputs generated by AI and pass them off as your own.
  5. Fact-check content accuracy using search engines.
  6. Ask yourself: If this content turns out to be inaccurate or untrue, am I willing or able to correct my mistake? If you aren’t, that’s probably an indicator that you shouldn’t share it.
  7. Only input essential information. Don’t provide any information that’s unnecessary, confidential, or private, because you may threaten the security of a person or the organization you’re working for.

  8. Read supporting documents associated with the tools you’re using. Any documentation that describes how the model was trained to use privacy safeguards (such as terms and conditions) can be a helpful resource for you.

  9. Ask yourself: If I use AI for this particular task, will it hurt anyone around me?

  10. Ask yourself: Does it reinforce or uphold biases that may cause damage to any groups of people?



