Complete AI Engineer Roadmap (#7508)

* ai eng content

* 57 topics

* 44 topics

* 68 topics, need to add links to the final 15 or so

* final topics

* update copy and links

* Update ai-engineer-vs-ml-engineer@jSZ1LhPdhlkW-9QJhIvFs.md

Co-authored-by: Kamran Ahmed <kamranahmed.se@gmail.com>

* Update code-completion-tools@TifVhqFm1zXNssA8QR3SM.md

Co-authored-by: Kamran Ahmed <kamranahmed.se@gmail.com>

* Update development-tools@NYge7PNtfI-y6QWefXJ4d.md

Co-authored-by: Kamran Ahmed <kamranahmed.se@gmail.com>

* Update introduction@_hYN0gEi9BL24nptEtXWU.md

Co-authored-by: Kamran Ahmed <kamranahmed.se@gmail.com>

* Update what-is-an-ai-engineer@GN6SnI7RXIeW8JeD-qORW.md

Co-authored-by: Kamran Ahmed <kamranahmed.se@gmail.com>

* resolve comments

* Update src/data/roadmaps/ai-engineer/content/image-understanding@fzVq4hGoa2gdbIzoyY1Zp.md

* Update src/data/roadmaps/ai-engineer/content/anomaly-detection@AglWJ7gb9rTT2rMkstxtk.md

* Update src/data/roadmaps/ai-engineer/content/chunking@mX987wiZF7p3V_gExrPeX.md

* Update src/data/roadmaps/ai-engineer/content/data-classification@06Xta-OqSci05nV2QMFdF.md

* Update src/data/roadmaps/ai-engineer/content/inference@KWjD4xEPhOOYS51dvRLd2.md

* Update src/data/roadmaps/ai-engineer/content/manual-implementation@6xaRB34_g0HGt-y1dGYXR.md

* Update src/data/roadmaps/ai-engineer/content/mongodb-atlas@j6bkm0VUgLkHdMDDJFiMC.md

* Update src/data/roadmaps/ai-engineer/content/video-understanding@TxaZCtTCTUfwCxAJ2pmND.md

* Update src/data/roadmaps/ai-engineer/content/performing-similarity-search@ZcbRPtgaptqKqWBgRrEBU.md

* Update src/data/roadmaps/ai-engineer/content/popular-open-source-models@97eu-XxYUH9pYbD_KjAtA.md

---------

Co-authored-by: Kamran Ahmed <kamranahmed.se@gmail.com>
Changed files:

1. src/data/roadmaps/ai-engineer/content/agents-usecases@778HsQzTuJ_3c9OSn5DmH.md (2 lines changed)
2. src/data/roadmaps/ai-engineer/content/ai-agents@AeHkNU-uJ_gBdo5-xdpEu.md (8 lines changed)
3. src/data/roadmaps/ai-engineer/content/ai-engineer-vs-ml-engineer@jSZ1LhPdhlkW-9QJhIvFs.md (9 lines changed)
4. src/data/roadmaps/ai-engineer/content/ai-safety-and-ethics@8ndKHDJgL_gYwaXC7XMer.md (7 lines changed)
5. src/data/roadmaps/ai-engineer/content/ai-vs-agi@5QdihE1lLpMc3DFrGy46M.md (7 lines changed)
6. src/data/roadmaps/ai-engineer/content/anomaly-detection@AglWJ7gb9rTT2rMkstxtk.md (6 lines changed)
7. src/data/roadmaps/ai-engineer/content/anthropics-claude@hy6EyKiNxk1x84J63dhez.md (7 lines changed)
8. src/data/roadmaps/ai-engineer/content/audio-processing@mxQYB820447DC6kogyZIL.md (7 lines changed)
9. src/data/roadmaps/ai-engineer/content/aws-sagemaker@OkYO-aSPiuVYuLXHswBCn.md (9 lines changed)
10. src/data/roadmaps/ai-engineer/content/azure-ai@3PQVZbcr4neNMRr6CuNzS.md (7 lines changed)
11. src/data/roadmaps/ai-engineer/content/benefits-of-pre-trained-models@1Ga6DbOPc6Crz7ilsZMYy.md (7 lines changed)
12. src/data/roadmaps/ai-engineer/content/bias-and-fareness@lhIU0ulpvDAn1Xc3ooYz_.md (10 lines changed)
13. src/data/roadmaps/ai-engineer/content/capabilities--context-length@vvpYkmycH0_W030E-L12f.md (7 lines changed)
14. src/data/roadmaps/ai-engineer/content/chat-completions-api@_bPTciEA1GT1JwfXim19z.md (7 lines changed)
15. src/data/roadmaps/ai-engineer/content/chunking@mX987wiZF7p3V_gExrPeX.md (8 lines changed)
16. src/data/roadmaps/ai-engineer/content/code-completion-tools@TifVhqFm1zXNssA8QR3SM.md (9 lines changed)
17. src/data/roadmaps/ai-engineer/content/cohere@a7qsvoauFe5u953I699ps.md (7 lines changed)
18. src/data/roadmaps/ai-engineer/content/conducting-adversarial-testing@Pt-AJmSJrOxKvolb5_HEv.md (7 lines changed)
19. src/data/roadmaps/ai-engineer/content/constraining-outputs-and-inputs@ONLDyczNacGVZGojYyJrU.md (7 lines changed)
20. src/data/roadmaps/ai-engineer/content/cut-off-dates--knowledge@LbB2PeytxRSuU07Bk0KlJ.md (7 lines changed)
21. src/data/roadmaps/ai-engineer/content/dall-e-api@LKFwwjtcawJ4Z12X102Cb.md (7 lines changed)
22. src/data/roadmaps/ai-engineer/content/data-classification@06Xta-OqSci05nV2QMFdF.md (6 lines changed)
23. src/data/roadmaps/ai-engineer/content/development-tools@NYge7PNtfI-y6QWefXJ4d.md (7 lines changed)
24. src/data/roadmaps/ai-engineer/content/embedding@grTcbzT7jKk_sIUwOTZTD.md (7 lines changed)
25. src/data/roadmaps/ai-engineer/content/embeddings@XyEp6jnBSpCxMGwALnYfT.md (8 lines changed)
26. src/data/roadmaps/ai-engineer/content/faiss@JurLbOO1Z8r6C3yUqRNwf.md (8 lines changed)
27. src/data/roadmaps/ai-engineer/content/fine-tuning@15XOFdVp0IC-kLYPXUJWh.md (7 lines changed)
28. src/data/roadmaps/ai-engineer/content/generation@2jJnS9vRYhaS69d6OxrMh.md (7 lines changed)
29. src/data/roadmaps/ai-engineer/content/googles-gemini@oe8E6ZIQWuYvHVbYJHUc1.md (7 lines changed)
30. src/data/roadmaps/ai-engineer/content/hugging-face-models@8XjkRqHOdyH-DbXHYiBEt.md (6 lines changed)
31. src/data/roadmaps/ai-engineer/content/hugging-face-models@EIDbwbdolR_qsNKVDla6V.md (7 lines changed)
32. src/data/roadmaps/ai-engineer/content/hugging-face@v99C5Bml2a6148LCJ9gy9.md (7 lines changed)
33. src/data/roadmaps/ai-engineer/content/image-generation@49BWxYVFpIgZCCqsikH7l.md (8 lines changed)
34. src/data/roadmaps/ai-engineer/content/image-understanding@fzVq4hGoa2gdbIzoyY1Zp.md (6 lines changed)
35. src/data/roadmaps/ai-engineer/content/impact-on-product-development@qJVgKe9uBvXc-YPfvX_Y7.md (7 lines changed)
36. src/data/roadmaps/ai-engineer/content/indexing-embeddings@5TQnO9B4_LTHwqjI7iHB1.md (7 lines changed)
37. src/data/roadmaps/ai-engineer/content/inference@KWjD4xEPhOOYS51dvRLd2.md (8 lines changed)
38. src/data/roadmaps/ai-engineer/content/introduction@_hYN0gEi9BL24nptEtXWU.md (2 lines changed)
39. src/data/roadmaps/ai-engineer/content/know-your-customers--usecases@t1SObMWkDZ1cKqNNlcd9L.md (6 lines changed)
40. src/data/roadmaps/ai-engineer/content/lancedb@rjaCNT3Li45kwu2gXckke.md (7 lines changed)
41. src/data/roadmaps/ai-engineer/content/langchain-for-multimodal-apps@j9zD3pHysB1CBhLfLjhpD.md (7 lines changed)
42. src/data/roadmaps/ai-engineer/content/langchain@ebXXEhNRROjbbof-Gym4p.md (7 lines changed)
43. src/data/roadmaps/ai-engineer/content/limitations-and-considerations@MXqbQGhNM3xpXlMC2ib_6.md (7 lines changed)
44. src/data/roadmaps/ai-engineer/content/llama-index@d0ontCII8KI8wfP-8Y45R.md (9 lines changed)
45. src/data/roadmaps/ai-engineer/content/llamaindex-for-multimodal-apps@akQTCKuPRRelj2GORqvsh.md (9 lines changed)
46. src/data/roadmaps/ai-engineer/content/llms@wf2BSyUekr1S1q6l8kyq6.md (10 lines changed)
47. src/data/roadmaps/ai-engineer/content/manual-implementation@6xaRB34_g0HGt-y1dGYXR.md (7 lines changed)
48. src/data/roadmaps/ai-engineer/content/maximum-tokens@qzvp6YxWDiGakA2mtspfh.md (7 lines changed)
49. src/data/roadmaps/ai-engineer/content/mistral-ai@n-Ud2dXkqIzK37jlKItN4.md (7 lines changed)
50. src/data/roadmaps/ai-engineer/content/mongodb-atlas@j6bkm0VUgLkHdMDDJFiMC.md (6 lines changed)
51. src/data/roadmaps/ai-engineer/content/multimodal-ai-usecases@sGR9qcro68KrzM8qWxcH8.md (6 lines changed)
52. src/data/roadmaps/ai-engineer/content/multimodal-ai@W7cKPt_UxcUgwp8J6hS4p.md (8 lines changed)
53. src/data/roadmaps/ai-engineer/content/ollama-models@ro3vY_sp6xMQ-hfzO-rc1.md (4 lines changed)
54. src/data/roadmaps/ai-engineer/content/ollama@rTT2UnvqFO3GH6ThPLEjO.md (6 lines changed)
55. src/data/roadmaps/ai-engineer/content/open-ai-assistant-api@eOqCBgBTKM8CmY3nsWjre.md (9 lines changed)
56. src/data/roadmaps/ai-engineer/content/open-ai-embedding-models@y0qD5Kb4Pf-ymIwW-tvhX.md (9 lines changed)
57. src/data/roadmaps/ai-engineer/content/open-ai-embeddings-api@l6priWeJhbdUD5tJ7uHyG.md (8 lines changed)
58. src/data/roadmaps/ai-engineer/content/open-ai-models@2WbVpRLqwi3Oeqk1JPui4.md (9 lines changed)
59. src/data/roadmaps/ai-engineer/content/open-ai-playground@nyBgEHvUhwF-NANMwkRJW.md (9 lines changed)
60. src/data/roadmaps/ai-engineer/content/open-source-embeddings@apVYIV4EyejPft25oAvdI.md (2 lines changed)
61. src/data/roadmaps/ai-engineer/content/open-vs-closed-source-models@RBwGsq9DngUsl8PrrCbqx.md (6 lines changed)
62. src/data/roadmaps/ai-engineer/content/openai-api@zdeuA4GbdBl2DwKgiOA4G.md (2 lines changed)
63. src/data/roadmaps/ai-engineer/content/openai-assistant-api@mbp2NoL-VZ5hZIIblNBXt.md (7 lines changed)
64. src/data/roadmaps/ai-engineer/content/openai-functions--tools@Sm0Ne5Nx72hcZCdAcC0C2.md (7 lines changed)
65. src/data/roadmaps/ai-engineer/content/openai-models@5ShWZl1QUqPwO-NRGN85V.md (6 lines changed)
66. src/data/roadmaps/ai-engineer/content/openai-moderation-api@ljZLa3yjQpegiZWwtnn_q.md (7 lines changed)
67. src/data/roadmaps/ai-engineer/content/openai-vision-api@CRrqa-dBw1LlOwVbrZhjK.md (7 lines changed)
68. src/data/roadmaps/ai-engineer/content/opensource-ai@a_3SabylVqzzOyw3tZN5f.md (6 lines changed)
69. src/data/roadmaps/ai-engineer/content/performing-similarity-search@ZcbRPtgaptqKqWBgRrEBU.md (2 lines changed)
70. src/data/roadmaps/ai-engineer/content/pinecone@_Cf7S1DCvX7p1_3-tP3C3.md (8 lines changed)
71. src/data/roadmaps/ai-engineer/content/popular-open-source-models@97eu-XxYUH9pYbD_KjAtA.md (6 lines changed)
72. src/data/roadmaps/ai-engineer/content/pricing-considerations@4GArjDYipit4SLqKZAWDf.md (4 lines changed)
73. src/data/roadmaps/ai-engineer/content/pricing-considerations@DZPM9zjCbYYWBPLmQImxQ.md (4 lines changed)
74. src/data/roadmaps/ai-engineer/content/prompt-engineering@Dc15ayFlzqMF24RqIF_-X.md (7 lines changed)
75. src/data/roadmaps/ai-engineer/content/prompt-injection-attacks@cUyLT6ctYQ1pgmodCKREq.md (7 lines changed)
76. src/data/roadmaps/ai-engineer/content/purpose-and-functionality@WcjX6p-V-Rdd77EL8Ega9.md (7 lines changed)
77. src/data/roadmaps/ai-engineer/content/qdrant@DwOAL5mOBgBiw-EQpAzQl.md (8 lines changed)
78. src/data/roadmaps/ai-engineer/content/rag--implementation@lVhWhZGR558O-ljHobxIi.md (2 lines changed)
79. src/data/roadmaps/ai-engineer/content/rag-usecases@GCn4LGNEtPI0NWYAZCRE-.md (8 lines changed)
80. src/data/roadmaps/ai-engineer/content/rag-vs-fine-tuning@qlBEXrbV88e_wAGRwO9hW.md (8 lines changed)
81. src/data/roadmaps/ai-engineer/content/rag@9JwWIK0Z2MK8-6EQQJsCO.md (8 lines changed)
82. src/data/roadmaps/ai-engineer/content/react-prompting@voDKcKvXtyLzeZdx2g3Qn.md (7 lines changed)
83. src/data/roadmaps/ai-engineer/content/recommendation-systems@HQe9GKy3p0kTUPxojIfSF.md (7 lines changed)
84. src/data/roadmaps/ai-engineer/content/replicate@c0RPhpD00VIUgF4HJgN2T.md (7 lines changed)
85. src/data/roadmaps/ai-engineer/content/retrieval-process@OCGCzHQM2LQyUWmiqe6E0.md (7 lines changed)
86. src/data/roadmaps/ai-engineer/content/robust-prompt-engineering@qmx6OHqx4_0JXVIv8dASp.md (7 lines changed)
87. src/data/roadmaps/ai-engineer/content/roles-and-responsiblities@K9EiuFgPBFgeRxY4wxAmb.md (7 lines changed)
88. src/data/roadmaps/ai-engineer/content/security-and-privacy-concerns@sWBT-j2cRuFqRFYtV_5TK.md (7 lines changed)
89. src/data/roadmaps/ai-engineer/content/semantic-search@eMfcyBxnMY_l_5-8eg6sD.md (7 lines changed)
90. src/data/roadmaps/ai-engineer/content/sentence-transformers@ZV_V6sqOnRodgaw4mzokC.md (8 lines changed)
91. src/data/roadmaps/ai-engineer/content/speech-to-text@jQX10XKd_QM5wdQweEkVJ.md (8 lines changed)
92. src/data/roadmaps/ai-engineer/content/supabase@9kT7EEQsbeD2WDdN9ADx7.md (7 lines changed)
93. src/data/roadmaps/ai-engineer/content/text-to-speech@GCERpLz5BcRtWPpv-asUz.md (7 lines changed)
94. src/data/roadmaps/ai-engineer/content/token-counting@FjV3oD7G2Ocq5HhUC17iH.md (7 lines changed)
95. src/data/roadmaps/ai-engineer/content/training@xostGgoaYkqMO28iN2gx8.md (8 lines changed)
96. src/data/roadmaps/ai-engineer/content/transformersjs@bGLrbpxKgENe2xS1eQtdh.md (5 lines changed)
97. src/data/roadmaps/ai-engineer/content/using-sdks-directly@WZVW8FQu6LyspSKm1C_sl.md (8 lines changed)
98. src/data/roadmaps/ai-engineer/content/vector-database@zZA1FBhf1y4kCoUZ-hM4H.md (7 lines changed)
99. src/data/roadmaps/ai-engineer/content/vector-databases@LnQ2AatMWpExUHcZhDIPd.md (7 lines changed)
100. src/data/roadmaps/ai-engineer/content/vector-databases@tt9u3oFlsjEMfPyojuqpc.md (2 lines changed)

Some files were not shown because too many files have changed in this diff.

@@ -4,6 +4,6 @@ AI Agents have a variety of usecases ranging from customer support, workflow aut
Visit the following resources to learn more:
- [@article@Top 15 Use Cases Of AI Agents In Business](https://www.ampcome.com/post/15-use-cases-of-ai-agents-in-business)
- [@article@A Brief Guide on AI Agents: Benefits and Use Cases](https://www.codica.com/blog/brief-guide-on-ai-agents/)
- [@video@The Complete Guide to Building AI Agents for Beginners](https://youtu.be/MOyl58VF2ak?si=-QjRD_5y3iViprJX)

@@ -1 +1,9 @@
# AI Agents
In AI engineering, "agents" refer to autonomous systems or components that can perceive their environment, make decisions, and take actions to achieve specific goals. Agents often interact with external systems, users, or other agents to carry out complex tasks. They can vary in complexity, from simple rule-based bots to sophisticated AI-powered agents that leverage machine learning models, natural language processing, and reinforcement learning.
Visit the following resources to learn more:
- [@article@Building an AI Agent Tutorial - LangChain](https://python.langchain.com/docs/tutorials/agents/)
- [@article@AI Agents and Their Types](https://play.ht/blog/ai-agents-use-cases/)
- [@video@The Complete Guide to Building AI Agents for Beginners](https://youtu.be/MOyl58VF2ak?si=-QjRD_5y3iViprJX)

@@ -1,8 +1,9 @@
# AI Engineer vs ML Engineer
An AI Engineer uses pre-trained models and existing AI tools to improve user experiences. They focus on applying AI in practical ways, without building models from scratch. This is different from AI Researchers and ML Engineers, who focus more on creating new models or developing AI theory.
Learn more from the following resources:
- [@article@AI Engineer vs. ML Engineer: Duties, Skills, and Qualifications](https://www.upwork.com/resources/ai-engineer-vs-ml-engineer)
- [@video@AI Developer vs ML Engineer: What’s the difference?](https://www.youtube.com/watch?v=yU87V2-XisA&t=2s)
- [@article@What does an AI Engineer do?](https://www.codecademy.com/resources/blog/what-does-an-ai-engineer-do/)
- [@article@What is an ML Engineer?](https://www.coursera.org/articles/what-is-machine-learning-engineer)
- [@video@AI vs ML](https://www.youtube.com/watch?v=4RixMPF4xis)

@@ -1 +1,8 @@
# AI Safety and Ethics
AI safety and ethics involve establishing guidelines and best practices to ensure that artificial intelligence systems are developed, deployed, and used in a manner that prioritizes human well-being, fairness, and transparency. This includes addressing risks such as bias, privacy violations, unintended consequences, and ensuring that AI operates reliably and predictably, even in complex environments. Ethical considerations focus on promoting accountability, avoiding discrimination, and aligning AI systems with human values and societal norms. Frameworks like explainability, human-in-the-loop design, and robust monitoring are often used to build systems that not only achieve technical objectives but also uphold ethical standards and mitigate potential harms.
Learn more from the following resources:
- [@video@What is AI Ethics?](https://www.youtube.com/watch?v=aGwYtUzMQUk)
- [@article@Understanding artificial intelligence ethics and safety](https://www.turing.ac.uk/news/publications/understanding-artificial-intelligence-ethics-and-safety)

@@ -1 +1,8 @@
# AI vs AGI
AI (Artificial Intelligence) refers to systems designed to perform specific tasks by mimicking aspects of human intelligence, such as pattern recognition, decision-making, and language processing. These systems, known as "narrow AI," are highly specialized, excelling in defined areas like image classification or recommendation algorithms but lacking broader cognitive abilities. In contrast, AGI (Artificial General Intelligence) represents a theoretical form of intelligence that possesses the ability to understand, learn, and apply knowledge across a wide range of tasks at a human-like level. AGI would have the capacity for abstract thinking, reasoning, and adaptability similar to human cognitive abilities, making it far more versatile than today’s AI systems. While current AI technology is powerful, AGI remains a distant goal and presents complex challenges in safety, ethics, and technical feasibility.
Learn more from the following resources:
- [@article@What is AGI?](https://aws.amazon.com/what-is/artificial-general-intelligence/)
- [@article@The crucial difference between AI and AGI](https://www.forbes.com/sites/bernardmarr/2024/05/20/the-crucial-difference-between-ai-and-agi/)

@@ -1 +1,7 @@
# Anomaly Detection
Anomaly detection with embeddings works by transforming data, such as text, images, or time-series data, into vector representations that capture their patterns and relationships. In this high-dimensional space, similar data points are positioned close together, while anomalies stand out as those that deviate significantly from the typical distribution. This approach is highly effective for detecting outliers in tasks like fraud detection, network security, and quality control.
Learn more from the following resources:
- [@article@Anomaly Detection with Embeddings](https://ai.google.dev/gemini-api/tutorials/anomaly_detection)
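To make the idea concrete, here is a minimal sketch (assuming NumPy, with random vectors standing in for real model embeddings) that flags points sitting unusually far from the cluster centroid:

```python
import numpy as np

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=0.05, size=(100, 768)) + 1.0  # tight cluster
outlier = rng.normal(loc=5.0, scale=1.0, size=(1, 768))          # far-away point
embeddings = np.vstack([normal, outlier])

centroid = embeddings.mean(axis=0)
distances = np.linalg.norm(embeddings - centroid, axis=1)

# Flag anything more than three standard deviations beyond the mean distance
threshold = distances.mean() + 3 * distances.std()
print(np.where(distances > threshold)[0])  # expected to report index 100
```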

@@ -1 +1,8 @@
# Anthropic's Claude
Anthropic's Claude is an AI language model designed to facilitate safe and scalable AI systems. Named after Claude Shannon, the father of information theory, Claude focuses on responsible AI use, emphasizing safety, alignment with human intentions, and minimizing harmful outputs. Built as a competitor to models like OpenAI's GPT, Claude is designed to handle natural language tasks such as generating text, answering questions, and supporting conversations, with a strong focus on aligning AI behavior with user goals while maintaining transparency and avoiding harmful biases.
Learn more from the following resources:
- [@official@Claude Website](https://claude.ai)
- [@video@How To Use Claude Pro For Beginners](https://www.youtube.com/watch?v=J3X_JWQkvo8)

@@ -1 +1,8 @@
# Audio Processing
Audio processing in multimodal AI enables a wide range of use cases by combining sound with other data types, such as text, images, or video, to create more context-aware systems. Use cases include speech recognition paired with real-time transcription and visual analysis in meetings or video conferencing tools, voice-controlled virtual assistants that can interpret commands in conjunction with on-screen visuals, and multimedia content analysis where audio and visual elements are analyzed together for tasks like content moderation or video indexing.
Learn more from the following resources:
- [@article@The State of Audio Processing](https://appwrite.io/blog/post/state-of-audio-processing)
- [@video@Audio Signal Processing for Machine Learning](https://www.youtube.com/watch?v=iCwMQJnKk2c)

@@ -1 +1,8 @@
# AWS SageMaker
AWS SageMaker is a fully managed machine learning service from Amazon Web Services that enables developers and data scientists to build, train, and deploy machine learning models at scale. It provides an integrated development environment, simplifying the entire ML workflow, from data preparation and model development to training, tuning, and inference. SageMaker supports popular ML frameworks like TensorFlow, PyTorch, and Scikit-learn, and offers features like automated model tuning, model monitoring, and one-click deployment. It's designed to make machine learning more accessible and scalable, even for large enterprise applications.
Learn more from the following resources:
- [@official@AWS SageMaker](https://aws.amazon.com/sagemaker/)
- [@video@Introduction to Amazon SageMaker](https://www.youtube.com/watch?v=Qv_Tr_BCFCQ)

@@ -1 +1,8 @@
# Azure AI
Azure AI is a suite of AI services and tools provided by Microsoft through its Azure cloud platform. It includes pre-built AI models for natural language processing, computer vision, and speech, as well as tools for developing custom machine learning models using services like Azure Machine Learning. Azure AI enables developers to integrate AI capabilities into applications with APIs for tasks like sentiment analysis, image recognition, and language translation. It also supports responsible AI development with features for model monitoring, explainability, and fairness, aiming to make AI accessible, scalable, and secure across industries.
Learn more from the following resources:
- [@official@Azure AI](https://azure.microsoft.com/en-gb/solutions/ai)
- [@video@How to Choose the Right Models for Your Apps](https://www.youtube.com/watch?v=sx_uGylH8eg)

@@ -1 +1,8 @@
# Benefits of Pre-trained Models
Pre-trained models offer several benefits in AI engineering by significantly reducing development time and computational cost. Because these models are trained on large datasets and can be fine-tuned for specific tasks, they enable quicker deployment and better performance with less data. They help overcome the challenge of needing vast amounts of labeled data and computational power for training from scratch. Additionally, pre-trained models often demonstrate improved accuracy, generalization, and robustness across different tasks, making them ideal for applications in natural language processing, computer vision, and other AI domains.
Learn more from the following resources:
- [@article@Why Pre-Trained Models Matter For Machine Learning](https://www.ahead.com/resources/why-pre-trained-models-matter-for-machine-learning/)
- [@article@Why You Should Use Pre-Trained Models Versus Building Your Own](https://cohere.com/blog/pre-trained-vs-in-house-nlp-models)

@@ -1 +1,9 @@
# Bias and Fairness
Bias and fairness in AI refer to the challenges of ensuring that machine learning models do not produce discriminatory or skewed outcomes. Bias can arise from imbalanced training data, flawed assumptions, or biased algorithms, leading to unfair treatment of certain groups based on race, gender, or other factors. Fairness aims to address these issues by developing techniques to detect, mitigate, and prevent biases in AI systems. Ensuring fairness involves improving data diversity, applying fairness constraints during model training, and continuously monitoring models in production to avoid unintended consequences, promoting ethical and equitable AI use.
Learn more from the following resources:
- [@article@What Do We Do About the Biases in AI?](https://hbr.org/2019/10/what-do-we-do-about-the-biases-in-ai)
- [@article@AI Bias - What Is It and How to Avoid It?](https://levity.ai/blog/ai-bias-how-to-avoid)
- [@article@What about fairness, bias and discrimination?](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/how-do-we-ensure-fairness-in-ai/what-about-fairness-bias-and-discrimination/)

@@ -1 +1,8 @@
# Capabilities / Context Length
A key aspect of the OpenAI models is their context length, which refers to the amount of input text the model can process at once. Earlier models like GPT-3 had a context length of up to 4,096 tokens (words or word pieces), while more recent models like GPT-4 can handle significantly larger context lengths, some supporting up to 32,768 tokens. This extended context length enables the models to handle more complex tasks, such as maintaining long conversations or processing lengthy documents, which enhances their utility in real-world applications like legal document analysis or code generation.
Learn more from the following resources:
- [@official@Managing Context](https://platform.openai.com/docs/guides/text-generation/managing-context-for-text-generation)
- [@official@Capabilities](https://platform.openai.com/docs/guides/text-generation)

@@ -1 +1,8 @@
# Chat Completions API
The OpenAI Chat Completions API is a powerful interface that allows developers to integrate conversational AI into applications by utilizing models like GPT-3.5 and GPT-4. It is designed to manage multi-turn conversations, keeping context across interactions, making it ideal for chatbots, virtual assistants, and interactive AI systems. With the API, users can structure conversations by providing messages in a specific format, where each message has a role (e.g., "system" to guide the model, "user" for input, and "assistant" for responses).
Learn more from the following resources:
- [@official@Create Chat Completions](https://platform.openai.com/docs/api-reference/chat/create)
- [@article@Getting Started with OpenAI's Chat Completions API in 2024](https://medium.com/the-ai-archives/getting-started-with-openais-chat-completions-api-in-2024-462aae00bf0a)
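A minimal sketch of the role-based message format, using the official `openai` Python SDK (v1+); the model name is an assumption and an `OPENAI_API_KEY` environment variable is expected:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name; any chat model works
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain token limits in one sentence."},
    ],
)
print(response.choices[0].message.content)
```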

@@ -1 +1,9 @@
# Chunking
The chunking step in Retrieval-Augmented Generation (RAG) involves breaking down large documents or data sources into smaller, manageable chunks. This is done to ensure that the retriever can efficiently search through large volumes of data while staying within the token or input limits of the model. Each chunk, typically a paragraph or section, is converted into an embedding, and these embeddings are stored in a vector database. When a query is made, the retriever searches for the most relevant chunks rather than the entire document, enabling faster and more accurate retrieval.
Learn more from the following resources:
- [@article@Understanding LangChain's RecursiveCharacterTextSplitter](https://dev.to/eteimz/understanding-langchains-recursivecharactertextsplitter-2846)
- [@article@Chunking Strategies for LLM Applications](https://www.pinecone.io/learn/chunking-strategies/)
- [@article@A Guide to Chunking Strategies for Retrieval Augmented Generation](https://zilliz.com/learn/guide-to-chunking-strategies-for-rag)
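As a rough illustration of the idea, here is a minimal fixed-size chunker with overlap, a simplified stand-in for library splitters like LangChain's `RecursiveCharacterTextSplitter`:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks; overlap preserves context at boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Each chunk would then be embedded and stored in a vector database.
print(len(chunk_text("word " * 1000)))
```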

@@ -1 +1,10 @@
# Code Completion Tools
Code completion tools are AI-powered development assistants designed to enhance productivity by automatically suggesting code snippets, functions, and entire blocks of code as developers type. These tools, such as GitHub Copilot and Tabnine, leverage machine learning models trained on vast code repositories to predict and generate contextually relevant code. They help reduce repetitive coding tasks, minimize errors, and accelerate the development process by offering real-time, intelligent suggestions.
Learn more from the following resources:
- [@official@GitHub Copilot](https://github.com/features/copilot)
- [@official@Codeium](https://codeium.com/)
- [@official@Supermaven](https://supermaven.com/)
- [@official@Tabnine](https://www.tabnine.com/)

@@ -1 +1,8 @@
# Cohere
Cohere is an AI platform that specializes in natural language processing (NLP) by providing large language models designed to help developers build and deploy text-based applications. Cohere’s models are used for tasks such as text classification, language generation, semantic search, and sentiment analysis. Unlike some other providers, Cohere emphasizes simplicity and scalability, offering an easy-to-use API that allows developers to fine-tune models on custom data for specific use cases. Additionally, Cohere provides robust multilingual support and focuses on ensuring that its NLP solutions are both accessible and enterprise-ready, catering to a wide range of industries.
Learn more from the following resources:
- [@official@Cohere Website](https://cohere.com/)
- [@article@What Does Cohere Do?](https://medium.com/geekculture/what-does-cohere-do-cdadf6d70435)

@@ -1 +1,8 @@
# Conducting adversarial testing
Adversarial testing involves intentionally exposing machine learning models to deceptive, perturbed, or carefully crafted inputs to evaluate their robustness and identify vulnerabilities. The goal is to simulate potential attacks or edge cases where the model might fail, such as subtle manipulations in images, text, or data that cause the model to misclassify or produce incorrect outputs. This type of testing helps to improve model resilience, particularly in sensitive applications like cybersecurity, autonomous systems, and finance.
Learn more from the following resources:
- [@article@Adversarial Testing for Generative AI](https://developers.google.com/machine-learning/resources/adv-testing)
- [@article@Adversarial Testing: Definition, Examples and Resources](https://www.leapwork.com/blog/adversarial-testing)

@@ -1 +1,8 @@
# Constraining outputs and inputs
Constraining outputs and inputs in AI models refers to implementing limits or rules that guide both the data the model processes (inputs) and the results it generates (outputs). Input constraints ensure that only valid, clean, and well-formed data enters the model, which helps to reduce errors and improve performance. This can include setting data type restrictions, value ranges, or specific formats. Output constraints, on the other hand, ensure that the model produces appropriate, safe, and relevant results, often by limiting output length, specifying answer formats, or applying filters to avoid harmful or biased responses. These constraints are crucial for improving model safety, alignment, and utility in practical applications.
Learn more from the following resources:
- [@article@Preventing Prompt Injection](https://learnprompting.org/docs/prompt_hacking/defensive_measures/introduction)
- [@article@Introducing Structured Outputs in the API - OpenAI](https://openai.com/index/introducing-structured-outputs-in-the-api/)
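A small sketch of both kinds of constraint using the `openai` SDK: a length check on the input, plus OpenAI's JSON mode and a token cap on the output (model name and limits are illustrative assumptions):

```python
from openai import OpenAI

client = OpenAI()
user_input = "Summarise this support ticket: the printer is on fire."

if len(user_input) > 2000:  # input constraint: reject oversized payloads early
    raise ValueError("Input too long")

response = client.chat.completions.create(
    model="gpt-4o",                           # assumed model name
    response_format={"type": "json_object"},  # output constraint: valid JSON only
    max_tokens=200,                           # output constraint: cap response length
    messages=[
        {"role": "system", "content": 'Reply only with JSON like {"summary": "..."}.'},
        {"role": "user", "content": user_input},
    ],
)
print(response.choices[0].message.content)
```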

@@ -1 +1,8 @@
# Cut-off Dates / Knowledge
OpenAI models, such as GPT-3.5 and GPT-4, have a knowledge cutoff date, which refers to the last point in time when the model was trained on data. For instance, as of the current version of GPT-4, the knowledge cutoff is October 2023. This means the model does not have awareness or knowledge of events, advancements, or data that occurred after that date. Consequently, the model may lack information on more recent developments, research, or real-time events unless explicitly updated in future versions. This limitation is important to consider when using the models for time-sensitive tasks or inquiries involving recent knowledge.
Learn more from the following resources:
- [@article@Knowledge Cutoff Dates of all LLMs explained](https://otterly.ai/blog/knowledge-cutoff/)
- [@article@Knowledge Cutoff Dates For ChatGPT, Meta Ai, Copilot, Gemini, Claude](https://computercity.com/artificial-intelligence/knowledge-cutoff-dates-llms)

@@ -1 +1,8 @@
# DALL-E API
The DALL-E API is a tool provided by OpenAI that allows developers to integrate the DALL-E image generation model into applications. DALL-E is an AI model designed to generate images from textual descriptions, capable of producing highly detailed and creative visuals. The API enables users to provide a descriptive prompt, and the model generates corresponding images, opening up possibilities in fields like design, advertising, content creation, and art.
Learn more from the following resources:
- [@official@OpenAI Image Generation](https://platform.openai.com/docs/guides/images)
- [@video@DALL E API - Introduction (Generative AI Pictures from OpenAI)](https://www.youtube.com/watch?v=Zr6vAWwjHN0)
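A minimal sketch of an image request via the `openai` Python SDK (the prompt and size are placeholders):

```python
from openai import OpenAI

client = OpenAI()
result = client.images.generate(
    model="dall-e-3",
    prompt="A watercolor painting of a fox reading a map",
    size="1024x1024",
    n=1,
)
print(result.data[0].url)  # temporary URL of the generated image
```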

@@ -1 +1,7 @@
# Data Classification
Once data is embedded, a classification algorithm, such as a neural network or a logistic regression model, can be trained on these embeddings to classify the data into different categories. The advantage of using embeddings is that they capture underlying relationships and similarities between data points, even if the raw data is complex or high-dimensional, improving classification accuracy in tasks like text classification, image categorization, and recommendation systems.
Learn more from the following resources:
- [@video@Text Embeddings, Classification, and Semantic Search (w/ Python Code)](https://www.youtube.com/watch?v=sNa_uiqSlJo)
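A minimal sketch with scikit-learn, where random vectors stand in for embeddings produced by a real model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in embeddings: in practice these come from an embedding model.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 64)), rng.normal(3, 1, (50, 64))])
y = np.array([0] * 50 + [1] * 50)  # e.g. 0 = "billing", 1 = "support"

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:2]))  # new texts are embedded, then classified the same way
```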

@@ -1 +1,8 @@
# Development Tools
AI has given rise to a wide variety of AI-powered development tools. These include IDEs like Cursor with AI baked in, live context-capturing tools such as Pieces, and a range of browser-based tools like v0, Claude, and more.
Learn more from the following resources:
- [@official@v0 Website](https://v0.dev)
- [@official@Aider - AI Pair Programming in Terminal](https://github.com/Aider-AI/aider)
- [@official@Replit AI](https://replit.com/ai)
- [@official@Pieces Website](https://pieces.app)

@@ -1 +1,8 @@
# Embedding
In Retrieval-Augmented Generation (RAG), embeddings are essential for linking information retrieval with natural language generation. Embeddings represent both the user query and documents as dense vectors in a shared space, enabling the system to retrieve relevant information based on similarity. This retrieved information is then fed into a generative model, such as GPT, to produce contextually informed and accurate responses. By using embeddings, RAG enhances the model's ability to generate content grounded in external knowledge, making it effective for tasks like question answering and summarization.
Learn more from the following resources:
- [@article@Understanding the role of embeddings in RAG LLMs](https://www.aporia.com/learn/understanding-the-role-of-embeddings-in-rag-llms/)
- [@article@Mastering RAG: How to Select an Embedding Model](https://www.rungalileo.io/blog/mastering-rag-how-to-select-an-embedding-model)

@@ -1 +1,9 @@
# Embeddings
Embeddings are dense, continuous vector representations of data, such as words, sentences, or images, in a lower-dimensional space. They capture the semantic relationships and patterns in the data, where similar items are placed closer together in the vector space. In machine learning, embeddings are used to convert complex data into numerical form that models can process more easily. For example, word embeddings represent words based on their meanings and contexts, allowing models to understand relationships like synonyms or analogies. Embeddings are widely used in tasks like natural language processing, recommendation systems, and image recognition to improve model performance and efficiency.
Learn more from the following resources:
- [@article@What are embeddings in machine learning?](https://www.cloudflare.com/en-gb/learning/ai/what-are-embeddings/)
- [@article@What is embedding?](https://www.ibm.com/topics/embedding)
- [@video@What are Word Embeddings](https://www.youtube.com/watch?v=wgfSDrqYMJ4)
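Cosine similarity over toy 4-dimensional vectors illustrates the "similar items sit close together" idea (real embeddings have hundreds or thousands of dimensions):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

king  = np.array([0.90, 0.80, 0.10, 0.20])  # toy vectors, not real model output
queen = np.array([0.88, 0.82, 0.15, 0.20])
apple = np.array([0.10, 0.20, 0.90, 0.70])

print(cosine_similarity(king, queen))  # close to 1: related concepts
print(cosine_similarity(king, apple))  # much lower: unrelated concepts
```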

@@ -1 +1,9 @@
# FAISS
FAISS (Facebook AI Similarity Search) is a library developed by Facebook AI for efficient similarity search and clustering of dense vectors, particularly useful for large-scale datasets. It is optimized to handle embeddings (vector representations) and enables fast nearest neighbor search, allowing you to retrieve similar items from a large collection of vectors based on distance or similarity metrics like cosine similarity or Euclidean distance. FAISS is widely used in applications such as image and text retrieval, recommendation systems, and large-scale search systems where embeddings are used to represent items. It offers several indexing methods and can scale to billions of vectors, making it a powerful tool for handling real-time, large-scale similarity search problems efficiently.
Learn more from the following resources:
- [@official@FAISS](https://ai.meta.com/tools/faiss/)
- [@video@FAISS Vector Library with LangChain and OpenAI](https://www.youtube.com/watch?v=ZCSsIkyCZk4)
- [@article@What Is Faiss (Facebook AI Similarity Search)?](https://www.datacamp.com/blog/faiss-facebook-ai-similarity-search)
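A minimal sketch of exact nearest-neighbour search with the `faiss` package (random vectors stand in for real embeddings):

```python
import numpy as np
import faiss

d = 128                                               # embedding dimensionality
xb = np.random.random((10_000, d)).astype("float32")  # database vectors
xq = np.random.random((5, d)).astype("float32")       # query vectors

index = faiss.IndexFlatL2(d)  # exact L2 index; IVF/HNSW variants scale further
index.add(xb)
distances, ids = index.search(xq, 4)  # 4 nearest neighbours per query
print(ids[0])                         # indices of the closest database vectors
```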

@@ -1 +1,8 @@
# Fine-tuning
Fine-tuning the OpenAI API involves adapting pre-trained models, such as GPT, to specific use cases by training them on custom datasets. This process allows you to refine the model's behavior and improve its performance on specialized tasks, like generating domain-specific text or following particular patterns. By providing labeled examples of the desired input-output pairs, you guide the model to better understand and predict the appropriate responses for your use case.
Learn more from the following resources:
- [@official@Fine-tuning Documentation](https://platform.openai.com/docs/guides/fine-tuning)
- [@video@Fine-tuning ChatGPT with OpenAI Tutorial](https://www.youtube.com/watch?v=VVKcSf6r3CM)
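A sketch of the typical flow with the `openai` SDK: write chat-formatted examples to a JSONL file, upload it, and start a job (the fine-tunable model name is an assumption; check the current docs):

```python
import json
from openai import OpenAI

# Each example is a chat transcript demonstrating the desired behaviour.
examples = [
    {"messages": [
        {"role": "system", "content": "You are Acme's support bot."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Go to Settings > Security > Reset."},
    ]},
]
with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

client = OpenAI()
uploaded = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=uploaded.id,
    model="gpt-4o-mini-2024-07-18",  # assumed fine-tunable model name
)
print(job.id)
```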

@@ -1 +1,8 @@
# Generation
Generation refers to the process where a generative language model, such as GPT, creates a response based on the information retrieved during the retrieval phase. After relevant documents or data snippets are identified using embeddings, they are passed to the generative model, which uses this information to produce coherent, context-aware, and informative responses. The retrieved content helps the model stay grounded and factual, enhancing its ability to answer questions, provide summaries, or engage in dialogue by combining retrieved knowledge with its natural language generation capabilities. This synergy between retrieval and generation makes RAG systems effective for tasks that require detailed, accurate, and contextually relevant outputs.
Learn more from the following resources:
- [@article@What is RAG (Retrieval-Augmented Generation)?](https://aws.amazon.com/what-is/retrieval-augmented-generation/)
- [@video@Retrieval Augmented Generation (RAG) Explained in 8 Minutes!](https://www.youtube.com/watch?v=HREbdmOSQ18)

@@ -1 +1,8 @@
# Google's Gemini
Google Gemini is an advanced AI model by Google DeepMind, designed to integrate natural language processing with multimodal capabilities, enabling it to understand and generate not just text but also images, videos, and other data types. It combines generative AI with reasoning skills, making it effective for complex tasks requiring logical analysis and contextual understanding. Built on Google's extensive knowledge base and infrastructure, Gemini aims to offer high accuracy, efficiency, and safety, positioning it as a competitor to models like OpenAI's GPT-4.
Learn more from the following resources:
- [@official@Google Gemini](https://workspace.google.com/solutions/ai/)
- [@video@Welcome to the Gemini era](https://www.youtube.com/watch?v=_fuimO6ErKI)

@@ -1 +1,7 @@
# Hugging Face Models
Hugging Face models are a collection of pre-trained machine learning models available through the Hugging Face platform, covering a wide range of tasks like natural language processing, computer vision, and audio processing. The platform includes models for tasks such as text classification, translation, summarization, question answering, and more, with popular models like BERT, GPT, T5, and CLIP. Hugging Face provides easy-to-use tools and APIs that allow developers to access, fine-tune, and deploy these models, fostering a collaborative community where users can share, modify, and contribute models to improve AI research and application development.
Learn more from the following resources:
- [@official@Hugging Face Models](https://huggingface.co/models)

@@ -1 +1,8 @@
# Hugging Face Models
Hugging Face models are a collection of pre-trained machine learning models available through the Hugging Face platform, covering a wide range of tasks like natural language processing, computer vision, and audio processing. The platform includes models for tasks such as text classification, translation, summarization, question answering, and more, with popular models like BERT, GPT, T5, and CLIP. Hugging Face provides easy-to-use tools and APIs that allow developers to access, fine-tune, and deploy these models, fostering a collaborative community where users can share, modify, and contribute models to improve AI research and application development.
Learn more from the following resources:
- [@official@Hugging Face Models](https://huggingface.co/models)
- [@video@How to Use Pretrained Models from Hugging Face in a Few Lines of Code](https://www.youtube.com/watch?v=ntz160EnWIc)

@@ -1,8 +1,9 @@
# Hugging Face
Hugging Face is a leading AI company and open-source platform that provides tools, models, and libraries for natural language processing (NLP), computer vision, and other machine learning tasks. It is best known for its "Transformers" library, which simplifies the use of pre-trained models like BERT, GPT, T5, and CLIP, making them accessible for tasks such as text classification, translation, summarization, and image recognition.
Learn more from the following resources:
- [@official@Hugging Face Website](https://huggingface.co)
- [@video@What is Hugging Face? - Machine Learning Hub Explained](https://www.youtube.com/watch?v=1AUjKfpRZVo)
- [@course@Hugging Face Official Video Course](https://www.youtube.com/watch?v=00GKzGyWFEs&list=PLo2EIpI_JMQvWfQndUesu0nPBAtZ9gP1o)

@@ -1 +1,9 @@
# Image Generation
Image generation is a process in artificial intelligence where models create new images based on input prompts or existing data. It involves using generative models like GANs (Generative Adversarial Networks), VAEs (Variational Autoencoders), or more recently, transformer-based models like DALL-E and Stable Diffusion.
Learn more from the following resources:
- [@official@DALL-E Website](https://openai.com/index/dall-e-2/)
- [@article@How DALL-E 2 Actually Works](https://www.assemblyai.com/blog/how-dall-e-2-actually-works/)
- [@video@How AI Image Generators Work (Stable Diffusion / Dall-E)](https://www.youtube.com/watch?v=1CIpzeNxIhU)

@@ -1 +1,7 @@
# Image Understanding
Multimodal AI enhances image understanding by integrating visual data with other types of information, such as text or audio. By combining these inputs, AI models can interpret images more comprehensively, recognizing objects, scenes, and actions, while also understanding context and related concepts. For example, an AI system could analyze an image and generate descriptive captions, or provide explanations based on both visual content and accompanying text.
Learn more from the following resources:
- [@article@Low or high fidelity image understanding - OpenAI](https://platform.openai.com/docs/guides/vision/low-or-high-fidelity-image-understanding)

@@ -1 +1,8 @@
# Impact on Product Development
AI engineering transforms product development by automating tasks, enhancing data-driven decision-making, and enabling the creation of smarter, more personalized products. It speeds up design cycles, optimizes processes, and allows for predictive maintenance, quality control, and efficient resource management. By integrating AI, companies can innovate faster, reduce costs, and improve user experiences, giving them a competitive edge in the market.
Learn more from the following resources:
- [@article@AI in Product Development: Netflix, BMW, and PepsiCo](https://www.virtasant.com/ai-today/ai-in-product-development-netflix-bmw)
- [@article@AI Product Development: Why Are Founders So Fascinated By The Potential?](https://www.techmagic.co/blog/ai-product-development/)

@@ -1 +1,8 @@
# Indexing Embeddings
Embeddings are stored in a vector database by first converting data, such as text, images, or audio, into high-dimensional vectors using machine learning models. These vectors, also called embeddings, capture the semantic relationships and patterns within the data. Once generated, each embedding is indexed in the vector database along with its associated metadata, such as the original data (e.g., text or image) or an identifier. The vector database then organizes these embeddings to support efficient similarity searches, typically using techniques like approximate nearest neighbor (ANN) search.
Learn more from the following resources:
- [@article@Indexing & Embeddings](https://docs.llamaindex.ai/en/stable/understanding/indexing/indexing/)
- [@video@Vector Databases simply explained! (Embeddings & Indexes)](https://www.youtube.com/watch?v=dN0lsF2cvm4)

@@ -1 +1,9 @@
# Inference
In artificial intelligence (AI), inference refers to the process where a trained machine learning model makes predictions or draws conclusions from new, unseen data. Unlike training, inference involves the model applying what it has learned to make decisions without needing examples of the exact result. In essence, inference is the AI model actively functioning. For example, a self-driving car recognizing a stop sign on a road it has never encountered before demonstrates inference. The model identifies the stop sign in a new setting, using its learned knowledge to make a decision in real-time.
Learn more from the following resources:
- [@article@Inference vs Training](https://www.cloudflare.com/learning/ai/inference-vs-training/)
- [@article@What is Machine Learning Inference?](https://hazelcast.com/glossary/machine-learning-inference/)
- [@article@What is Machine Learning Inference? An Introduction to Inference Approaches](https://www.datacamp.com/blog/what-is-machine-learning-inference)

@@ -1 +1,3 @@
# Introduction
AI Engineering is the process of designing and implementing AI systems using pre-trained models and existing AI tools to solve practical problems. AI Engineers focus on applying AI in real-world scenarios, improving user experiences, and automating tasks, without developing new models from scratch. They work to ensure AI systems are efficient, scalable, and can be seamlessly integrated into business applications, distinguishing their role from AI Researchers and ML Engineers, who concentrate more on creating new models or advancing AI theory.

@@ -1 +1,7 @@
# Know your Customers / Usecases
To know your customer means deeply understanding the needs, behaviors, and expectations of your target users. This ensures the tools you create are tailored precisely for their intended purpose, while also being designed to prevent misuse or unintended applications. By clearly defining the tool’s functionality and boundaries, you can align its features with the users’ goals while incorporating safeguards that limit its use in contexts it wasn’t designed for. This approach enhances both the tool’s effectiveness and safety, reducing the risk of improper use.
Learn more from the following resources:
- [@article@Assigning Roles](https://learnprompting.org/docs/basics/roles)

@@ -1 +1,8 @@
# LanceDB
LanceDB is a vector database designed for efficient storage, retrieval, and management of embeddings. It enables users to perform fast similarity searches, particularly useful in applications like recommendation systems, semantic search, and AI-driven content retrieval. LanceDB focuses on scalability and speed, allowing large-scale datasets of embeddings to be indexed and queried quickly, which is essential for real-time AI applications. It integrates well with machine learning workflows, making it easier to deploy models that rely on vector-based data processing, and helps manage the complexities of handling high-dimensional vector data efficiently.
Learn more from the following resources:
- [@official@LanceDB Website](https://lancedb.com/)
- [@opensource@LanceDB on GitHub](https://github.com/lancedb/lancedb)

@@ -1 +1,8 @@
# LangChain for Multimodal Apps
LangChain is a framework designed to build applications that integrate multiple AI models, especially those focusing on language understanding, generation, and multimodal capabilities. For multimodal apps, LangChain facilitates seamless interaction between text, image, and even audio models, enabling developers to create complex workflows that can process and analyze different types of data.
Learn more from the following resources:
- [@official@LangChain Website](https://www.langchain.com/)
- [@video@Build a Multimodal GenAI App with LangChain and Gemini LLMs](https://www.youtube.com/watch?v=bToMzuiOMhg)

@@ -1 +1,8 @@
# Langchain
LangChain is a development framework that simplifies building applications powered by language models, enabling seamless integration of multiple AI models and data sources. It focuses on creating chains, or sequences, of operations where language models can interact with databases, APIs, and other models to perform complex tasks. LangChain offers tools for prompt management, data retrieval, and workflow orchestration, making it easier to develop robust, scalable applications like chatbots, automated data analysis, and multi-step reasoning systems.
Learn more from the following resources:
- [@official@LangChain Website](https://www.langchain.com/)
- [@video@What is LangChain?](https://www.youtube.com/watch?v=1bUy-1hGZpI)

@@ -1 +1,8 @@
# Limitations and Considerations
Pre-trained models, while powerful, come with several limitations and considerations. They may carry biases present in the training data, leading to unintended or discriminatory outcomes. These models are also typically trained on general data, so they might not perform well on niche or domain-specific tasks without further fine-tuning. Another concern is the "black-box" nature of many pre-trained models, which can make their decision-making processes hard to interpret and explain.
Learn more from the following resources:
- [@article@Pretrained Topic Models: Advantages and Limitation](https://www.kaggle.com/code/amalsalilan/pretrained-topic-models-advantages-and-limitation)
- [@video@Should You Use Open Source Large Language Models?](https://www.youtube.com/watch?v=y9k-U9AuDeM)

@@ -1 +1,8 @@
# LlamaIndex
LlamaIndex, formerly known as GPT Index, is a tool designed to facilitate the integration of large language models (LLMs) with structured and unstructured data sources. It acts as a data framework that helps developers build retrieval-augmented generation (RAG) applications by indexing various types of data, such as documents, databases, and APIs, enabling LLMs to query and retrieve relevant information efficiently.
Learn more from the following resources:
- [@official@LlamaIndex Website](https://docs.llamaindex.ai/en/stable/)
- [@video@Introduction to LlamaIndex with Python (2024)](https://www.youtube.com/watch?v=cCyYGYyCka4)

@@ -1 +1,8 @@
# LlamaIndex for Multi-modal Apps
LlamaIndex enables multi-modal apps by linking language models (LLMs) to diverse data sources, including text and images. It indexes and retrieves information across formats, allowing LLMs to process and integrate data from multiple modalities. This supports applications like visual question answering, content summarization, and interactive systems by providing structured, context-aware inputs from various content types.
Learn more from the following resources:
- [@official@LlamaIndex Multi-modal](https://docs.llamaindex.ai/en/stable/use_cases/multimodal/)
- [@video@Multi-modal Retrieval Augmented Generation with LlamaIndex](https://www.youtube.com/watch?v=35RlrrgYDyU)

@@ -1,9 +1,9 @@
# LLMs
LLMs, or Large Language Models, are advanced AI models trained on vast datasets to understand and generate human-like text. They can perform a wide range of natural language processing tasks, such as text generation, translation, summarization, and question answering. Examples include GPT-4, BERT, and T5. LLMs are capable of understanding context, handling complex queries, and generating coherent responses, making them useful for applications like chatbots, content creation, and automated support. However, they require significant computational resources and may carry biases from their training data.
Learn more from the following resources:
- [@article@What is a large language model (LLM)?](https://www.cloudflare.com/en-gb/learning/ai/what-is-large-language-model/)
- [@video@How Large Language Models Work](https://www.youtube.com/watch?v=5sLYAQS9sWQ)
- [@video@Large Language Models (LLMs) - Everything You NEED To Know](https://www.youtube.com/watch?v=osKyvYJ3PRM)

@@ -1 +1,8 @@
# Manual Implementation
Services like [OpenAI functions and tools](https://platform.openai.com/docs/guides/function-calling) or [Vercel's AI SDK](https://sdk.vercel.ai/docs/foundations/tools) make it easy to build agents, but it is a good idea to learn how these abstractions work under the hood. You can also build a fully custom agent by implementing the loop yourself, as sketched below.
Learn more from the following resources:
Learn more from the following resources:
- [@official@OpenAI Function Calling](https://platform.openai.com/docs/guides/function-calling)
- [@official@Vercel AI SDK](https://sdk.vercel.ai/docs/foundations/tools)
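A stripped-down version of such a loop with the `openai` SDK: the model either answers or requests a tool call, and the loop feeds tool results back until an answer arrives (the weather tool is a stub; model name is assumed):

```python
import json
from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    return f"18°C and cloudy in {city}"  # stub: a real tool would call an API

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
while True:
    reply = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
    msg = reply.choices[0].message
    if not msg.tool_calls:  # no tool requested: this is the final answer
        print(msg.content)
        break
    messages.append(msg)    # keep the assistant's tool request in the history
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_weather(**args),
        })
```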

@@ -1 +1,8 @@
# Maximum Tokens
The OpenAI API has different maximum token limits depending on the model being used. For instance, GPT-3 has a limit of 4,096 tokens, while GPT-4 can support larger inputs, with some versions allowing up to 8,192 tokens, and extended versions reaching up to 32,768 tokens. Tokens include both the input text and the generated output, so longer inputs mean less space for responses. Managing token limits is crucial to ensure the model can handle the entire input and still generate a complete response, especially for tasks involving lengthy documents or multi-turn conversations.
Learn more from the following resources:
- [@official@Maximum Tokens](https://platform.openai.com/docs/guides/rate-limits)
- [@article@The Ins and Outs of GPT Token Limits](https://www.supernormal.com/blog/gpt-token-limits)
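Token budgets can be checked before sending a request with the `tiktoken` library; a small sketch (the 8,192 window matches the base GPT-4 figure mentioned above):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")
prompt = "Summarise the attached contract in plain English."
prompt_tokens = len(enc.encode(prompt))

context_window = 8192                   # base GPT-4 limit from the text above
budget_for_reply = context_window - prompt_tokens
print(prompt_tokens, budget_for_reply)  # input + output must fit the window
```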

@@ -1 +1,8 @@
# Mistral AI
Mistral AI is a company focused on developing open-weight, large language models (LLMs) to provide high-performance AI solutions. Mistral aims to create models that are both efficient and versatile, making them suitable for a wide range of natural language processing tasks, including text generation, translation, and summarization. By releasing open-weight models, Mistral promotes transparency and accessibility, allowing developers to customize and deploy AI solutions more flexibly compared to proprietary models.
Learn more from the following resources:
- [@official@Mistral AI Website](https://mistral.ai/)
- [@video@Mistral AI: The Gen AI Start-up you did not know existed](https://www.youtube.com/watch?v=vzrRGd18tAg)

@@ -1 +1,7 @@
# MongoDB Atlas
MongoDB Atlas, traditionally known for its document database capabilities, now includes vector search functionality, making it a strong option as a vector database. This feature allows developers to store and query high-dimensional vector data alongside regular document data. With Atlas’s vector search, users can perform similarity searches on embeddings of text, images, or other complex data, making it ideal for AI and machine learning applications like recommendation systems, image similarity search, and natural language processing tasks. The seamless integration of vector search within the MongoDB ecosystem allows developers to leverage familiar tools and interfaces while benefiting from advanced vector-based operations for sophisticated data analysis and retrieval.
Learn more from the following resources:
- [@official@Vector Search in MongoDB Atlas](https://www.mongodb.com/products/platform/atlas-vector-search)

@@ -1 +1,7 @@
# Multimodal AI Usecases
Multimodal AI powers applications like visual question answering, content moderation, and enhanced search engines. It drives smarter virtual assistants and interactive AR apps, combining text, images, and audio for richer, more intuitive user experiences across e-commerce, accessibility, and entertainment.
Learn more from the following resources:
- [@official@Hugging Face Multimodal Models](https://huggingface.co/learn/computer-vision-course/en/unit4/multimodal-models/a_multimodal_world)

@@ -1 +1,9 @@
# Multimodal AI
Multimodal AI is an approach that combines and processes data from multiple sources, such as text, images, audio, and video, to understand and generate responses. By integrating different data types, it enables more comprehensive and accurate AI systems, allowing for tasks like visual question answering, interactive virtual assistants, and enhanced content understanding. This capability helps create richer, more context-aware applications that can analyze and respond to complex, real-world scenarios.
Learn more from the following resources:
- [@article@A Multimodal World - Hugging Face](https://huggingface.co/learn/computer-vision-course/en/unit4/multimodal-models/a_multimodal_world)
- [@article@Multimodal AI - Google](https://cloud.google.com/use-cases/multimodal-ai?hl=en)
- [@article@What Is Multimodal AI? A Complete Introduction](https://www.splunk.com/en_us/blog/learn/multimodal-ai.html)

@ -1,8 +1,8 @@
# Ollama Models
Ollama provides a collection of large language models (LLMs) designed to run locally on personal devices, enabling privacy-focused and efficient AI applications without relying on cloud services. These models can perform tasks like text generation, translation, summarization, and question answering, similar to popular models like GPT. Ollama emphasizes ease of use, offering models that are optimized for lower resource consumption, making it possible to deploy AI capabilities directly on laptops or edge devices.
Learn more from the following resources:
- [@official@Ollama Model Library](https://ollama.com/library)
- [@video@What are the different types of models? Ollama Course](https://www.youtube.com/watch?v=f4tXwCNP1Ac)

@ -1,8 +1,8 @@
# Ollama
Ollama is a platform that offers large language models (LLMs) designed to run locally on personal devices, enabling AI functionality without relying on cloud services. It focuses on privacy, performance, and ease of use by allowing users to deploy models directly on laptops, desktops, or edge devices, providing fast, offline AI capabilities. With tools like the Ollama SDK, developers can integrate these models into their applications for tasks such as text generation, summarization, and more, benefiting from reduced latency, greater data control, and seamless local processing.
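As a minimal sketch of local integration, assuming Ollama is installed, listening on its default port 11434, and a model has been pulled with `ollama pull llama3`:
```typescript
// Call the local Ollama server's REST API (no cloud, no API key).
const res = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3",                               // any locally pulled model
    prompt: "Explain embeddings in one sentence.",
    stream: false,                                 // one JSON object, not a stream
  }),
});
const data = await res.json();
console.log(data.response); // the generated text
```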
Learn more from the following resources:
- [@article@Ollama Explained](https://www.geeksforgeeks.org/ollama-explained-transforming-ai-accessibility-and-language-processing/)
- [@official@Ollama Website](https://ollama.com/)
- [@article@Ollama: Easily run LLMs locally](https://klu.ai/glossary/ollama)

@ -1 +1,8 @@
# OpenAI Assistant API
The OpenAI Assistant API enables developers to create advanced conversational systems using models like GPT-4. It supports multi-turn conversations, allowing the AI to maintain context across exchanges, which is ideal for chatbots, virtual assistants, and interactive applications. Developers can customize interactions by defining roles, such as system, user, and assistant, to guide the assistant's behavior. With features like temperature control, token limits, and stop sequences, the API offers flexibility to ensure responses are relevant, safe, and tailored to specific use cases.
Learn more from the following resources:
- [@official@Assistants API](https://platform.openai.com/docs/assistants/overview)
- [@course@OpenAI Assistants API – Course for Beginners](https://www.youtube.com/watch?v=qHPonmSX4Ms)

@ -1 +1,8 @@
# OpenAI Embedding Models
OpenAI's embedding models convert text into dense vector representations that capture semantic meaning, allowing for efficient similarity searches, clustering, and recommendations. These models are commonly used for tasks like semantic search, where similar phrases are mapped to nearby points in a vector space, and for building recommendation systems by comparing embeddings to find related content. OpenAI's embedding models offer versatility, supporting a range of applications from document retrieval to content classification, and can be easily integrated through the OpenAI API for scalable and efficient deployment.
Learn more from the following resources:
- [@official@OpenAI Embedding Models](https://platform.openai.com/docs/guides/embeddings/embedding-models)
- [@video@OpenAI Embeddings Explained in 5 Minutes](https://www.youtube.com/watch?v=8kJStTRuMcs)

@ -1 +1,7 @@
# OpenAI Embeddings API
The OpenAI Embeddings API allows developers to generate dense vector representations of text, which capture semantic meaning and relationships. These embeddings can be used for various tasks, such as semantic search, recommendation systems, and clustering, by enabling the comparison of text based on similarity in vector space. The API supports easy integration and scalability, making it possible to handle large datasets and perform tasks like finding similar documents, organizing content, or building recommendation engines.
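A minimal sketch, assuming the official `openai` Node SDK (v4+) with `OPENAI_API_KEY` set in the environment:
```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Embed two strings, then compare them with cosine similarity.
const { data } = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: ["How do I reset my password?", "Steps to change my login credentials"],
});

const [a, b] = data.map((d) => d.embedding);
const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
console.log("cosine similarity:", dot / (norm(a) * norm(b))); // near 1 for paraphrases
```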
Learn more from the following resources:
- [@official@OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings/create)
- [@video@Master OpenAI EMBEDDING API](https://www.youtube.com/watch?v=9oCS-VQupoc)

@ -1 +1,8 @@
# OpenAI Models
OpenAI provides a variety of models designed for diverse tasks. GPT models like GPT-3 and GPT-4 handle text generation, conversation, and translation, offering context-aware responses, while Codex specializes in generating and debugging code across multiple languages. DALL-E creates images from text descriptions, supporting applications in design and content creation, and Whisper is a speech recognition model that converts spoken language to text for transcription and voice-to-text tasks.
Learn more from the following resources:
- [@official@OpenAI Models Overview](https://platform.openai.com/docs/models)
- [@video@OpenAI’s new “deep-thinking” o1 model crushes coding benchmarks](https://www.youtube.com/watch?v=6xlPJiNpCVw)

@ -1 +1,8 @@
# OpenAI Playground
The OpenAI Playground is an interactive web interface that allows users to experiment with OpenAI's language models, such as GPT-3 and GPT-4, without needing to write code. It provides a user-friendly environment where you can input prompts, adjust parameters like temperature and token limits, and see how the models generate responses in real-time. The Playground helps users test different use cases, from text generation to question answering, and refine prompts for better outputs. It's a valuable tool for exploring the capabilities of OpenAI models, prototyping ideas, and understanding how the models behave before integrating them into applications.
Learn more from the following resources:
- [@official@OpenAI Playground](https://platform.openai.com/playground/chat)
- [@video@How to Use OpenAI Playground Like a Pro](https://www.youtube.com/watch?v=PLxpvtODiqs)

@ -1 +1,3 @@
# Open-Source Embeddings
Open-source embeddings are pre-trained vector representations of data, usually text, that are freely available for use and modification. These embeddings capture semantic meanings, making them useful for tasks like semantic search, text classification, and clustering. Examples include Word2Vec, GloVe, and FastText, which represent words as vectors based on their context in large corpora, and more advanced models like Sentence-BERT and CLIP that provide embeddings for sentences and images. Open-source embeddings allow developers to leverage pre-trained models without starting from scratch, enabling faster development and experimentation in natural language processing and other AI applications.

@ -1,8 +1,8 @@
# Open vs Closed Source Models
Open-source models are freely available for customization and collaboration, promoting transparency and flexibility, while closed-source models are proprietary, offering ease of use but limiting modification and transparency.
Learn more from the following resources:
- [@article@Open AI vs Closed AI](https://formtek.com/blog/open-ai-vs-closed-ai-whats-the-difference-and-why-does-it-matter/)
- [@article@Open vs Closed Source Model](https://www.techtarget.com/searchEnterpriseAI/feature/Attributes-of-open-vs-closed-AI-explained)
- [@article@OpenAI vs. open-source LLM](https://ubiops.com/openai-vs-open-source-llm/)
- [@video@AI360 | Open-Source vs Closed-Source LLMs](https://www.youtube.com/watch?v=710PDpuLwOc)

@ -1 +1,3 @@
# OpenAI API
The OpenAI API provides access to powerful AI models like GPT, Codex, DALL-E, and Whisper, enabling developers to integrate capabilities such as text generation, code assistance, image creation, and speech recognition into their applications via a simple, scalable interface.

@ -1 +1,8 @@
# OpenAI Assistant API
The OpenAI Assistant API enables developers to create advanced conversational systems using models like GPT-4. It supports multi-turn conversations, allowing the AI to maintain context across exchanges, which is ideal for chatbots, virtual assistants, and interactive applications. Developers can customize interactions by defining roles, such as system, user, and assistant, to guide the assistant's behavior. With features like temperature control, token limits, and stop sequences, the API offers flexibility to ensure responses are relevant, safe, and tailored to specific use cases.
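A minimal sketch of that flow, assuming the `openai` Node SDK (v4+); the Assistants API is a beta surface, so method names may shift between SDK versions:
```typescript
import OpenAI from "openai";

const openai = new OpenAI();

// 1. Create an assistant with a role and instructions.
const assistant = await openai.beta.assistants.create({
  name: "Support Bot",
  instructions: "You answer questions about our product documentation.",
  model: "gpt-4o",
});

// 2. Start a thread and add the user's message; the thread keeps
//    context across turns.
const thread = await openai.beta.threads.create();
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "How do I rotate my API key?",
});

// 3. Run the assistant on the thread and wait for a terminal state.
const run = await openai.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: assistant.id,
});

if (run.status === "completed") {
  const messages = await openai.beta.threads.messages.list(thread.id);
  console.log(messages.data[0].content); // newest message first: the reply
}
```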
Learn more from the following resources:
- [@official@Assistants API](https://platform.openai.com/docs/assistants/overview)
- [@course@OpenAI Assistants API – Course for Beginners](https://www.youtube.com/watch?v=qHPonmSX4Ms)

@ -1 +1,8 @@
# OpenAI Functions / Tools
OpenAI Functions, also known as tools, enable developers to extend the capabilities of language models by integrating external APIs and functionalities, allowing the models to perform specific actions, fetch real-time data, or interact with other software systems. This feature enhances the model's utility by bridging it with services like web searches, databases, and custom business applications, enabling more dynamic and task-oriented responses.
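A minimal sketch with the `openai` Node SDK (v4+); `get_weather` is a hypothetical function your own code would implement — the model only returns the name and JSON arguments:
```typescript
import OpenAI from "openai";

const openai = new OpenAI();

const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What's the weather in Paris?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather", // hypothetical; implemented by your application
        description: "Get the current weather for a city",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
  ],
});

// The model "calls" the tool by returning its name and arguments.
const call = completion.choices[0].message.tool_calls?.[0];
if (call) {
  console.log(call.function.name, JSON.parse(call.function.arguments));
  // => get_weather { city: "Paris" }
}
```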
Learn more from the following resources:
- [@official@Function Calling](https://platform.openai.com/docs/guides/function-calling)
- [@video@How does OpenAI Function Calling work?](https://www.youtube.com/watch?v=Qor2VZoBib0)

@ -1 +1,7 @@
# OpenAI Models
OpenAI provides a variety of models designed for diverse tasks. GPT models like GPT-3 and GPT-4 handle text generation, conversation, and translation, offering context-aware responses, while Codex specializes in generating and debugging code across multiple languages. DALL-E creates images from text descriptions, supporting applications in design and content creation, and Whisper is a speech recognition model that converts spoken language to text for transcription and voice-to-text tasks.
Learn more from the following resources:
- [@official@OpenAI Models Overview](https://platform.openai.com/docs/models)

@ -1 +1,8 @@
# OpenAI Moderation API
The OpenAI Moderation API helps detect and filter harmful content by analyzing text for issues like hate speech, violence, self-harm, and adult content. It uses machine learning models to identify inappropriate or unsafe language, allowing developers to create safer online environments and maintain community guidelines. The API is designed to be integrated into applications, websites, and platforms, providing real-time content moderation to reduce the spread of harmful or offensive material.
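A minimal sketch using the `openai` Node SDK (v4+):
```typescript
import OpenAI from "openai";

const openai = new OpenAI();

const result = await openai.moderations.create({
  input: "Some user-submitted text to check",
});

const [verdict] = result.results;
if (verdict.flagged) {
  // verdict.categories marks which policy areas were triggered
  console.log("Blocked:", verdict.categories);
} else {
  console.log("Safe to display");
}
```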
Learn more from the following resources:
- [@official@Moderation](https://platform.openai.com/docs/guides/moderation)
- [@article@How to use the moderation API](https://cookbook.openai.com/examples/how_to_use_moderation)

@ -1 +1,8 @@
# OpenAI Vision API
The OpenAI Vision API enables models to analyze and understand images, allowing them to identify objects, recognize text, and interpret visual content. It integrates image processing with natural language capabilities, enabling tasks like visual question answering, image captioning, and extracting information from photos. This API can be used for applications in accessibility, content moderation, and automation, providing a seamless way to combine visual understanding with text-based interactions.
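A minimal sketch, assuming the `openai` Node SDK (v4+) and a vision-capable model; the image URL is a placeholder:
```typescript
import OpenAI from "openai";

const openai = new OpenAI();

// Send an image URL alongside a text question in a single user message.
const completion = await openai.chat.completions.create({
  model: "gpt-4o", // a vision-capable model
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What objects are in this photo?" },
        { type: "image_url", image_url: { url: "https://example.com/photo.jpg" } },
      ],
    },
  ],
});

console.log(completion.choices[0].message.content);
```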
Learn more from the following resources:
- [@official@Vision](https://platform.openai.com/docs/guides/vision)
- [@video@OpenAI Vision API Crash Course](https://www.youtube.com/watch?v=ZjkS11DSeEk)

@ -1,8 +1,8 @@
# OpenSource AI
Open-source AI refers to AI models, tools, and frameworks that are freely available for anyone to use, modify, and distribute. Examples include TensorFlow, PyTorch, and models like BERT and Stable Diffusion. Open-source AI fosters transparency, collaboration, and innovation by allowing developers to inspect code, adapt models for specific needs, and contribute improvements. This approach accelerates the development of AI technologies, enabling faster experimentation and reducing dependency on proprietary solutions.
Learn more from the following resources:
- [@article@The Open Source AI Definition](https://opensource.org/deepdive/drafts/the-open-source-ai-definition-draft-v-0-0-3)
- [@article@Defining Open Source AI](https://www.technologyreview.com/2024/08/22/1097224/we-finally-have-a-definition-for-open-source-ai/)
- [@article@Open Source AI Is the Path Forward](https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/)
- [@video@Should You Use Open Source Large Language Models?](https://www.youtube.com/watch?v=y9k-U9AuDeM)

@ -1 +1,3 @@
# Performing Similarity Search
In a similarity search, the process begins by converting the user’s query (such as a piece of text or an image) into an embedding—a vector representation that captures the query’s semantic meaning. This embedding is generated using a pre-trained model, such as BERT for text or a neural network for images. Once the query is converted into a vector, it is compared to the embeddings stored in the vector database.
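The comparison itself is usually cosine similarity over the stored vectors. Vector databases index this step; the brute-force version below is a toy sketch with made-up 3-dimensional embeddings:
```typescript
// Cosine similarity: ~1 means same direction (similar meaning), ~0 unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const query = [0.1, 0.9, 0.2]; // embedding of the user's query
const docs = [
  { id: "doc-a", embedding: [0.1, 0.8, 0.3] },
  { id: "doc-b", embedding: [0.9, 0.1, 0.0] },
];

const ranked = docs
  .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
  .sort((x, y) => y.score - x.score);

console.log(ranked); // doc-a ranks first
```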

@ -1 +1,9 @@
# Pinecone
Pinecone is a managed vector database designed for efficient similarity search and real-time retrieval of high-dimensional data, such as embeddings. It allows developers to store, index, and query vector representations, making it easy to build applications like recommendation systems, semantic search, and AI-driven content discovery. Pinecone is scalable, handles large datasets, and provides fast, low-latency searches using optimized indexing techniques.
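A minimal sketch with the `@pinecone-database/pinecone` SDK; the index name, vector values, and metadata are placeholders, and real vectors must match the dimension the index was created with:
```typescript
import { Pinecone } from "@pinecone-database/pinecone";

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index("products"); // hypothetical pre-created index

// Store a vector with metadata...
await index.upsert([
  { id: "item-1", values: [0.12, 0.87, 0.33], metadata: { title: "Red shoes" } },
]);

// ...then fetch the 3 nearest neighbours of a query embedding.
const results = await index.query({
  vector: [0.1, 0.9, 0.3],
  topK: 3,
  includeMetadata: true,
});
console.log(results.matches);
```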
Learn more from the following resources:
- [@official@Pinecone Website](https://www.pinecone.io)
- [@article@Everything you need to know about Pinecone](https://www.packtpub.com/article-hub/everything-you-need-to-know-about-pinecone-a-vector-database?srsltid=AfmBOorXsy9WImpULoLjd-42ERvTzj3pQb7C2EFgamWlRobyGJVZKKdz)
- [@video@Introducing Pinecone Serverless](https://www.youtube.com/watch?v=iCuR6ihHQgc)

@ -1,8 +1,8 @@
# Popular Open Source Models
Open-source large language models (LLMs) are models whose source code and architecture are publicly available for use, modification, and distribution. They are built using machine learning algorithms that process and generate human-like text, and being open-source, they promote transparency, innovation, and community collaboration in their development and application.
Learn more from the following resources:
- [@article@Mark on Open Source AI](https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/)
- [@article@The best large language models (LLMs) in 2024](https://zapier.com/blog/best-llm/)
- [@article@8 Top Open-Source LLMs for 2024 and Their Uses](https://www.datacamp.com/blog/top-open-source-llms)

@ -1 +1,5 @@
# Pricing Considerations
The pricing for the OpenAI Embedding API is based on the number of tokens processed and the specific embedding model used. Costs are determined by the total tokens needed to generate embeddings, so longer texts will result in higher charges. To manage costs, developers can optimize by shortening inputs or batching requests. Additionally, selecting the right embedding model for your performance and budget requirements, along with monitoring token usage, can help control expenses.
- [@official@OpenAI API Pricing](https://openai.com/api/pricing/)

@ -1 +1,5 @@
# Pricing Considerations
When using the OpenAI API, pricing considerations depend on factors like the model type, usage volume, and specific features utilized. Different models, such as GPT-3.5, GPT-4, or DALL-E, have varying cost structures based on the complexity of the model and the number of tokens processed (inputs and outputs). For cost efficiency, you should optimize prompt design, monitor usage, and consider rate limits or volume discounts offered by OpenAI for high usage.
- [@official@OpenAI API Pricing](https://openai.com/api/pricing/)

@ -1 +1,8 @@
# Prompt Engineering
Prompt engineering is the process of crafting effective inputs (prompts) to guide AI models, like GPT, to generate desired outputs. It involves strategically designing prompts to optimize the model’s performance by providing clear instructions, context, and examples. Effective prompt engineering can improve the quality, relevance, and accuracy of responses, making it essential for applications like chatbots, content generation, and automated support. By refining prompts, developers can better control the model’s behavior, reduce ambiguity, and achieve more consistent results, enhancing the overall effectiveness of AI-driven systems.
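As a small illustration, here is the same task phrased vaguely and then engineered with a role, constraints, and an explicit output format; the ticket placeholder is hypothetical:
```typescript
const vaguePrompt = "Summarize this support ticket.";

// The engineered version pins down role, length, and output structure.
const engineeredPrompt = `You are a support triage assistant.
Summarize the ticket below in at most 2 sentences, then classify it.

Output format:
Summary: <summary>
Category: <billing | bug | feature-request>

Ticket:
"""
{ticket text here}
"""`;
```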
Learn more from the following resources:
- [@roadmap@Prompt Engineering Roadmap](https://roadmap.sh/prompt-engineering)
- [@video@What is Prompt Engineering?](https://www.youtube.com/watch?v=nf1e-55KKbg)

@ -1 +1,8 @@
# Prompt Injection Attacks
Prompt injection attacks are a type of security vulnerability where malicious inputs are crafted to manipulate or exploit AI models, like language models, to produce unintended or harmful outputs. These attacks involve injecting deceptive or adversarial content into the prompt to bypass filters, extract confidential information, or make the model respond in ways it shouldn't. For instance, a prompt injection could trick a model into revealing sensitive data or generating inappropriate responses by altering its expected behavior.
Learn more from the following resources:
- [@article@Prompt Injection in LLMs](https://www.promptingguide.ai/prompts/adversarial-prompting/prompt-injection)
- [@article@What is a prompt injection attack?](https://www.wiz.io/academy/prompt-injection-attack)

@ -1 +1,8 @@
# Purpose and Functionality
A vector database is designed to store, manage, and retrieve high-dimensional vectors (embeddings) generated by AI models. Its primary purpose is to perform fast and efficient similarity searches, enabling applications to find data points that are semantically or visually similar to a given query. Unlike traditional databases, which handle structured data, vector databases excel at managing unstructured data like text, images, and audio by converting them into dense vector representations. They use indexing techniques, such as approximate nearest neighbor (ANN) algorithms, to quickly search large datasets and return relevant results. Vector databases are essential for applications like recommendation systems, semantic search, and content discovery, where understanding and retrieving similar items is crucial.
Learn more from the following resources:
- [@article@What is a Vector Database? Top 12 Use Cases](https://lakefs.io/blog/what-is-vector-databases/)
- [@article@Vector Databases: Intro, Use Cases](https://www.v7labs.com/blog/vector-databases)

@ -1 +1,9 @@
# Qdrant
Qdrant is an open-source vector database designed for efficient similarity search and real-time data retrieval. It specializes in storing and indexing high-dimensional vectors (embeddings) to enable fast and accurate searches across large datasets. Qdrant is particularly suited for applications like recommendation systems, semantic search, and AI-driven content discovery, where finding similar items quickly is essential. It supports advanced filtering, scalable indexing, and real-time updates, making it easy to integrate into machine learning workflows.
Learn more from the following resources:
- [@official@Qdrant Website](https://qdrant.tech/)
- [@opensource@Qdrant on GitHub](https://github.com/qdrant/qdrant)
- [@video@Getting started with Qdrant](https://www.youtube.com/watch?v=LRcZ9pbGnno)

@ -1 +1,3 @@
# RAG & Implementation
Retrieval-Augmented Generation (RAG) combines information retrieval with language generation to produce more accurate, context-aware responses. It uses two components: a retriever, which searches a database to find relevant information, and a generator, which crafts a response based on the retrieved data. Implementing RAG involves using a retrieval model (e.g., embeddings and vector search) alongside a generative language model (like GPT). The process starts by converting a query into embeddings, retrieving relevant documents from a vector database, and feeding them to the language model, which then generates a coherent, informed response. This approach grounds outputs in real-world data, resulting in more reliable and detailed answers.
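A minimal end-to-end sketch, assuming the `openai` Node SDK (v4+); the in-memory `store` array stands in for a real vector database:
```typescript
import OpenAI from "openai";

const openai = new OpenAI();

// Pre-embedded chunks; in practice these live in a vector database.
const store: { text: string; embedding: number[] }[] = [];

const embed = async (text: string) =>
  (await openai.embeddings.create({ model: "text-embedding-3-small", input: text }))
    .data[0].embedding;

const cosine = (a: number[], b: number[]) => {
  const dot = a.reduce((s, v, i) => s + v * b[i], 0);
  const mag = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (mag(a) * mag(b));
};

async function answer(question: string) {
  // Retrieve: embed the query and take the top-3 most similar chunks.
  const q = await embed(question);
  const context = store
    .map((d) => ({ text: d.text, score: cosine(q, d.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 3)
    .map((d) => d.text)
    .join("\n---\n");

  // Generate: ground the model's answer in the retrieved context.
  const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: `Answer using only this context:\n${context}` },
      { role: "user", content: question },
    ],
  });
  return completion.choices[0].message.content;
}
```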

@ -1 +1,9 @@
# RAG Usecases
Retrieval-Augmented Generation (RAG) enhances applications like chatbots, customer support, and content summarization by combining information retrieval with language generation. It retrieves relevant data from a knowledge base and uses it to generate accurate, context-aware responses, making it ideal for tasks such as question answering, document generation, and semantic search. RAG’s ability to ground outputs in real-world information leads to more reliable and informative results, improving user experience across various domains.
Learn more from the following resources:
- [@article@Retrieval augmented generation use cases: Transforming data into insights](https://www.glean.com/blog/retrieval-augmented-generation-use-cases)
- [@article@Retrieval Augmented Generation (RAG) – 5 Use Cases](https://theblue.ai/blog/rag-news/)
- [@video@Introduction to RAG](https://www.youtube.com/watch?v=LmiFeXH-kq8&list=PL-pTHQz4RcBbz78Z5QXsZhe9rHuCs1Jw-)

@ -1 +1,9 @@
# RAG vs Fine-tuning
RAG (Retrieval-Augmented Generation) and fine-tuning are two approaches to enhancing language models, but they differ in methodology and use cases. Fine-tuning involves training a pre-trained model on a specific dataset to adapt it to a particular task, making it more accurate for that context but limited to the knowledge present in the training data. RAG, on the other hand, combines real-time information retrieval with generation, enabling the model to access up-to-date external data and produce contextually relevant responses. While fine-tuning is ideal for specialized, static tasks, RAG is better suited for dynamic tasks that require real-time, fact-based responses.
Learn more from the following resources:
- [@article@RAG vs Fine Tuning: How to Choose the Right Method](https://www.montecarlodata.com/blog-rag-vs-fine-tuning/)
- [@article@RAG vs Finetuning — Which Is the Best Tool to Boost Your LLM Application?](https://towardsdatascience.com/rag-vs-finetuning-which-is-the-best-tool-to-boost-your-llm-application-94654b1eaba7)
- [@video@RAG vs Fine-tuning](https://www.youtube.com/watch?v=00Q0G84kq3M)

@ -1 +1,9 @@
# RAG
Retrieval-Augmented Generation (RAG) is an AI approach that combines information retrieval with language generation to create more accurate, contextually relevant outputs. It works by first retrieving relevant data from a knowledge base or external source, then using a language model to generate a response based on that information. This method enhances the accuracy of generative models by grounding their outputs in real-world data, making RAG ideal for tasks like question answering, summarization, and chatbots that require reliable, up-to-date information.
Learn more from the following resources:
- [@article@What is Retrieval Augmented Generation (RAG)?](https://www.datacamp.com/blog/what-is-retrieval-augmented-generation-rag)
- [@article@What is Retrieval-Augmented Generation? Google](https://cloud.google.com/use-cases/retrieval-augmented-generation)
- [@video@What is Retrieval-Augmented Generation? IBM](https://www.youtube.com/watch?v=T-D1OfcDW1M)

@ -1 +1,8 @@
# ReAct Prompting
ReAct prompting is a technique that combines reasoning and action by guiding language models to think through a problem step-by-step and then take specific actions based on the reasoning. It encourages the model to break down tasks into logical steps (reasoning) and perform operations, such as calling APIs or retrieving information (actions), to reach a solution. This approach helps in scenarios where the model needs to process complex queries, interact with external systems, or handle tasks requiring a sequence of actions, improving the model's ability to provide accurate and context-aware responses.
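The pattern is easiest to see in the prompt itself; below is a hand-rolled ReAct-style template (the action names and the worked example are illustrative, not a library API):
```typescript
const reactPrompt = `
Answer the question by interleaving Thought, Action and Observation steps.
Available actions: search[query], calculate[expression].

Question: How old was Ada Lovelace when the Analytical Engine was proposed?
Thought: I need her birth year and the proposal year of the Analytical Engine.
Action: search[Ada Lovelace birth year]
Observation: 1815
Action: search[Analytical Engine proposal year]
Observation: 1837
Thought: 1837 - 1815 = 22.
Final Answer: She was about 22 years old.

Question: {your question here}
Thought:`;
```
Your orchestration code parses each `Action:` line, executes the corresponding tool, appends the result as an `Observation:`, and resubmits the prompt until a `Final Answer:` appears.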
Learn more from the following resources:
- [@article@ReAct Prompting](https://www.promptingguide.ai/techniques/react)
- [@article@ReAct Prompting: How We Prompt for High-Quality Results from LLMs](https://www.width.ai/post/react-prompting)

@ -1 +1,8 @@
# Recommendation Systems
In the context of embeddings, recommendation systems use vector representations to capture similarities between items, such as products or content. By converting items and user preferences into embeddings, these systems can measure how closely related different items are based on vector proximity, allowing them to recommend similar products or content based on a user's past interactions. This approach improves recommendation accuracy and efficiency by enabling meaningful, scalable comparisons of complex data.
Learn more from the following resources:
- [@article@What role does AI play in recommendation systems and engines?](https://www.algolia.com/blog/ai/what-role-does-ai-play-in-recommendation-systems-and-engines/)
- [@article@What is a recommendation engine?](https://www.ibm.com/think/topics/recommendation-engine)

@ -1 +1,8 @@
# Replicate
Replicate is a platform that allows developers to run machine learning models in the cloud without needing to manage infrastructure. It provides a simple API for deploying and scaling models, making it easy to integrate AI capabilities like image generation, text processing, and more into applications. Users can select from a library of pre-trained models or deploy their own, with the platform handling tasks like scaling, monitoring, and versioning.
Learn more from the following resources:
- [@official@Replicate Website](https://replicate.com/)
- [@video@Replicate.com Beginners Tutorial](https://www.youtube.com/watch?v=y0_GE5ErqY8)

@ -1 +1,8 @@
# Retrieval Process
The retrieval process in Retrieval-Augmented Generation (RAG) involves finding relevant information from a large dataset or knowledge base to support the generation of accurate, context-aware responses. When a query is received, the system first converts it into a vector (embedding) and uses this vector to search a database of pre-indexed embeddings, identifying the most similar or relevant data points. Techniques like approximate nearest neighbor (ANN) search are often used to speed up this process.
Learn more from the following resources:
- [@article@What is Retrieval-Augmented Generation (RAG)?](https://cloud.google.com/use-cases/retrieval-augmented-generation)
- [@article@What Is Retrieval-Augmented Generation, aka RAG?](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/)

@ -1 +1,8 @@
# Robust prompt engineering
Robust prompt engineering involves carefully crafting inputs to guide AI models toward producing accurate, relevant, and reliable outputs. It focuses on minimizing ambiguity and maximizing clarity by providing specific instructions, examples, or structured formats. Effective prompts anticipate potential issues, such as misinterpretation or inappropriate responses, and address them through testing and refinement. This approach enhances the consistency and quality of the model's behavior, making it especially useful for complex tasks like multi-step reasoning, content generation, and interactive systems.
Learn more from the following resources:
- [@article@Building Robust Prompt Engineering Capability](https://aimresearch.co/product/building-robust-prompt-engineering-capability)
- [@article@Effective Prompt Engineering: A Comprehensive Guide](https://medium.com/@nmurugs/effective-prompt-engineering-a-comprehensive-guide-803160c571ed)

@ -1 +1,8 @@
# Roles and Responsibilities
AI Engineers are responsible for designing, developing, and deploying AI systems that solve real-world problems. Their roles include building machine learning models, implementing data processing pipelines, and integrating AI solutions into existing software or platforms. They work on tasks like data collection, cleaning, and labeling, as well as model training, testing, and optimization to ensure high performance and accuracy. AI Engineers also focus on scaling models for production use, monitoring their performance, and troubleshooting issues. Additionally, they collaborate with data scientists, software developers, and other stakeholders to align AI projects with business goals, ensuring that solutions are reliable, efficient, and ethically sound.
Learn more from the following resources:
- [@article@AI Engineer Job Description](https://resources.workable.com/ai-engineer-job-description)
- [@article@How To Become an AI Engineer (Plus Job Duties and Skills)](https://www.indeed.com/career-advice/finding-a-job/ai-engineer)

@ -1 +1,8 @@
# Security and Privacy Concerns
Security and privacy concerns in AI revolve around the protection of data and the responsible use of models. Key issues include ensuring that sensitive data, such as personal information, is handled securely during collection, processing, and storage, to prevent unauthorized access and breaches. AI models can also inadvertently expose sensitive data if not properly designed, leading to privacy risks through data leakage or misuse. Additionally, there are concerns about model bias, data misuse, and ensuring transparency in how AI decisions are made.
Learn more from the following resources:
- [@article@Examining Privacy Risks in AI Systems](https://transcend.io/blog/ai-and-privacy)
- [@video@AI Is Dangerous, but Not for the Reasons You Think | Sasha Luccioni | TED](https://www.youtube.com/watch?v=eXdVDhOGqoE)

@ -1 +1,8 @@
# Semantic Search
Embeddings are used for semantic search by converting text, such as queries and documents, into high-dimensional vectors that capture the underlying meaning and context, rather than just exact words. These embeddings represent the semantic relationships between words or phrases, allowing the system to understand the query’s intent and retrieve relevant information, even if the exact terms don’t match.
Learn more from the following resources:
- [@article@What is semantic search?](https://www.elastic.co/what-is/semantic-search)
- [@video@What is Semantic Search? Cohere](https://www.youtube.com/watch?v=fFt4kR4ntAA)

@ -1 +1,9 @@
# Sentence Transformers
Sentence Transformers are a type of model designed to generate high-quality embeddings for sentences, allowing them to capture the semantic meaning of text. Unlike traditional word embeddings, which represent individual words, Sentence Transformers understand the context of entire sentences, making them ideal for tasks that require semantic similarity, such as sentence clustering, semantic search, and paraphrase detection. Built on top of transformer models like BERT and RoBERTa, they convert sentences into dense vectors, where similar sentences are placed closer together in vector space.
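Sentence Transformers itself ships as a Python library, but exported checkpoints can also run in JavaScript; a sketch using Transformers.js with the `Xenova/all-MiniLM-L6-v2` ONNX port of a popular sentence-transformers model:
```typescript
import { pipeline } from "@xenova/transformers";

// Mean-pooled, normalized sentence embeddings from a sentence-transformers model.
const extractor = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

const output = await extractor(
  ["The cat sits on the mat.", "A feline rests on a rug."],
  { pooling: "mean", normalize: true } // one vector per sentence
);

// With normalized vectors, the dot product equals cosine similarity.
const [a, b] = output.tolist();
const similarity = a.reduce((s: number, v: number, i: number) => s + v * b[i], 0);
console.log(similarity.toFixed(3)); // high for these paraphrases
```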
Learn more from the following resources:
- [@article@What is BERT?](https://h2o.ai/wiki/bert/)
- [@article@SentenceTransformers Documentation](https://sbert.net/)
- [@article@Using Sentence Transformers at Hugging Face](https://huggingface.co/docs/hub/sentence-transformers)

@ -1 +1,9 @@
# Speech-to-Text
In the context of multimodal AI, speech-to-text technology converts spoken language into written text, enabling seamless integration with other data types like images and text. This allows AI systems to process audio input and combine it with visual or textual information, enhancing applications such as virtual assistants, interactive chatbots, and multimedia content analysis. For example, a multimodal AI can transcribe a video’s audio while simultaneously analyzing on-screen visuals and text, providing richer and more context-aware insights.
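A minimal transcription sketch using OpenAI's Whisper endpoint via the `openai` Node SDK (v4+); the file name is a placeholder, and the cloud services linked below offer equivalents:
```typescript
import fs from "node:fs";
import OpenAI from "openai";

const openai = new OpenAI();

// Upload an audio file and get back its transcript.
const transcription = await openai.audio.transcriptions.create({
  file: fs.createReadStream("meeting.mp3"), // placeholder file
  model: "whisper-1",
});

console.log(transcription.text);
```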
Learn more from the following resources:
- [@article@What is speech to text? Amazon](https://aws.amazon.com/what-is/speech-to-text/)
- [@article@Turn speech into text using Google AI](https://cloud.google.com/speech-to-text)
- [@article@How is Speech to Text Used?](https://h2o.ai/wiki/speech-to-text/)

@ -1 +1,8 @@
# Supabase
Supabase Vector is an extension of the Supabase platform, specifically designed for AI and machine learning applications that require vector operations. It leverages PostgreSQL's pgvector extension to provide efficient vector storage and similarity search capabilities. This makes Supabase Vector particularly useful for applications involving embeddings, semantic search, and recommendation systems. With Supabase Vector, developers can store and query high-dimensional vector data alongside regular relational data, all within the same PostgreSQL database.
Learn more from the following resources:
- [@official@Supabase Vector website](https://supabase.com/vector)
- [@video@Supabase Vector: The Postgres Vector database](https://www.youtube.com/watch?v=MDxEXKkxf2Q)

@ -1 +1,8 @@
# Text-to-Speech
In the context of multimodal AI, text-to-speech (TTS) technology converts written text into natural-sounding spoken language, allowing AI systems to communicate verbally. When integrated with other modalities, such as visual or interactive elements, TTS can enhance user experiences in applications like virtual assistants, educational tools, and accessibility features. For example, a multimodal AI could read aloud text from an on-screen document while highlighting relevant sections, or narrate information about objects recognized in an image. By combining TTS with other forms of data processing, multimodal AI creates more engaging, accessible, and interactive systems for users.
Learn more from the following resources:
- [@article@What is Text-to-Speech?](https://aws.amazon.com/polly/what-is-text-to-speech/)
- [@article@From Text to Speech: The Evolution of Synthetic Voices](https://ignitetech.ai/about/blogs/text-speech-evolution-synthetic-voices)

@ -1 +1,8 @@
# Token Counting
Token counting refers to tracking the number of tokens processed during interactions with language models, including both input and output text. Tokens are units of text that can be as short as a single character or as long as a word, and models like GPT process text by splitting it into these tokens. Knowing how many tokens are used is crucial because the API has token limits (e.g., 4,096 for GPT-3.5 Turbo and up to 32,768 for some versions of GPT-4), and costs are typically calculated based on the total number of tokens processed.
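A minimal counting sketch, assuming the `js-tiktoken` package (a JavaScript port of OpenAI's tokenizer):
```typescript
import { encodingForModel } from "js-tiktoken";

// Count tokens the same way the API will, before sending the request.
const enc = encodingForModel("gpt-4");
const prompt = "How many tokens does this sentence use?";

console.log(enc.encode(prompt).length); // exact count varies by tokenizer
```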
Learn more from the following resources:
- [@official@OpenAI Tokenizer Tool](https://platform.openai.com/tokenizer)
- [@article@How to count tokens with Tiktoken](https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken)

@ -1 +1,9 @@
# Training
Training refers to the process of teaching a machine learning model to recognize patterns and make predictions by exposing it to a dataset. During training, the model learns from the data by adjusting its internal parameters to minimize errors between its predictions and the actual outcomes. This process involves iteratively feeding the model with input data, comparing its outputs to the correct answers, and refining its predictions through techniques like gradient descent. The goal is to enable the model to generalize well so that it can make accurate predictions on new, unseen data.
Learn more from the following resources:
- [@article@What is Model Training?](https://oden.io/glossary/model-training/)
- [@article@Machine learning model training: What it is and why it’s important](https://domino.ai/blog/what-is-machine-learning-model-training)
- [@article@Training ML Models - Amazon](https://docs.aws.amazon.com/machine-learning/latest/dg/training-ml-models.html)

@ -1,7 +1,8 @@
# Transformers.js
Transformers.js is a JavaScript library that enables transformer models, like those from Hugging Face, to run directly in the browser or Node.js, without needing cloud services. It supports tasks such as text generation, sentiment analysis, and translation within web apps or server-side scripts. Using WebAssembly (Wasm) and efficient JavaScript, Transformers.js offers powerful NLP capabilities with low latency, enhanced privacy, and offline functionality, making it ideal for real-time, interactive applications where local processing is essential for performance and security.
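A minimal sketch with the `@xenova/transformers` package; the first call downloads and caches a small default model, after which inference runs locally with no API key:
```typescript
import { pipeline } from "@xenova/transformers";

// Build a sentiment-analysis pipeline backed by a local ONNX model.
const classify = await pipeline("sentiment-analysis");

const result = await classify("Transformers.js makes local inference easy!");
console.log(result); // e.g. [{ label: 'POSITIVE', score: 0.99 }]
```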
Learn more from the following resources:
- [@official@Transformers.js on Hugging Face](https://huggingface.co/docs/transformers.js/en/index)
- [@video@How Transformer.js Can Help You Create Smarter AI In Your Browser](https://www.youtube.com/watch?v=MNJHu9zjpqg)

@ -1 +1,9 @@
# Using SDKs Directly
While tools like Langchain and LlamaIndex make it easy to implement RAG, you don't necessarily have to learn and use them. If you know the individual steps of a RAG pipeline, you can implement them yourself: do the chunking with the `@langchain/textsplitters` package, create embeddings with any model (e.g., the OpenAI Embeddings API through the official SDK), save the embeddings to a vector database of your choice (e.g., Supabase Vector through its SDK), and use the relevant SDKs for the remaining steps as well.
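A sketch of two of those steps wired together directly, assuming the `@langchain/textsplitters` and `openai` packages; the document text is a placeholder:
```typescript
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import OpenAI from "openai";

const openai = new OpenAI();
const longDocument = "Your source text here ...";

// Step 1: chunk the document.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 500,
  chunkOverlap: 50,
});
const chunks = await splitter.splitText(longDocument);

// Step 2: embed every chunk in one request.
const { data } = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: chunks,
});

// Step 3 would store data[i].embedding in your vector database
// of choice (e.g. Supabase Vector via its own SDK).
```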
Learn more from the following resources:
- [@official@Langchain Text Splitter Package](https://www.npmjs.com/package/@langchain/textsplitters)
- [@official@OpenAI Embedding API](https://platform.openai.com/docs/guides/embeddings)
- [@official@Supabase AI & Vector Documentation](https://supabase.com/docs/guides/ai)

@ -1 +1,8 @@
# Vector Database
When implementing Retrieval-Augmented Generation (RAG), a vector database is used to store and efficiently retrieve embeddings, which are vector representations of data like documents, images, or other knowledge sources. During the RAG process, when a query is made, the system converts it into an embedding and searches the vector database for the most relevant, similar embeddings (e.g., related documents or snippets). These retrieved pieces of information are then fed to a generative model, which uses them to produce a more accurate, context-aware response.
Learn more from the following resources:
- [@article@How to Implement Graph RAG Using Knowledge Graphs and Vector Databases](https://towardsdatascience.com/how-to-implement-graph-rag-using-knowledge-graphs-and-vector-databases-60bb69a22759)
- [@article@Retrieval Augmented Generation (RAG) with vector databases: Expanding AI Capabilities](https://objectbox.io/retrieval-augmented-generation-rag-with-vector-databases-expanding-ai-capabilities/)

@ -1 +1,8 @@
# Vector Databases
Vector databases are specialized systems designed to store, index, and retrieve high-dimensional vectors, often used as embeddings that represent data like text, images, or audio. Unlike traditional databases that handle structured data, vector databases excel at managing unstructured data by enabling fast similarity searches, where vectors are compared to find those that are most similar to a query. This makes them essential for tasks like semantic search, recommendation systems, and content discovery, where understanding relationships between items is crucial. Vector databases use indexing techniques such as approximate nearest neighbor (ANN) search to efficiently handle large datasets, ensuring quick and accurate retrieval even at scale.
Learn more from the following resources:
- [@article@Vector Databases](https://developers.cloudflare.com/vectorize/reference/what-is-a-vector-database/)
- [@article@What are Vector Databases?](https://www.mongodb.com/resources/basics/databases/vector-databases)

@ -1 +1,3 @@
# Vector Databases
Vector databases are systems specialized in storing, indexing, and retrieving high-dimensional vectors, often used as embeddings for data like text, images, or audio. Unlike traditional databases, they excel at managing unstructured data by enabling fast similarity searches, where vectors are compared to find the closest matches. This makes them essential for tasks like semantic search, recommendation systems, and content discovery. Using techniques like approximate nearest neighbor (ANN) search, vector databases handle large datasets efficiently, ensuring quick and accurate retrieval even at scale.
