Improve AI engineer content (#7924)

Branch: pull/7953/head
Author: Vedansh, committed 1 month ago via GitHub
Parent: 4c6f0a1234
Commit: a2063c2822
  1. src/data/roadmaps/ai-engineer/content/ai-safety-and-ethics@8ndKHDJgL_gYwaXC7XMer.md (2 changes)
  2. src/data/roadmaps/ai-engineer/content/anomaly-detection@AglWJ7gb9rTT2rMkstxtk.md (2 changes)
  3. src/data/roadmaps/ai-engineer/content/capabilities--context-length@vvpYkmycH0_W030E-L12f.md (2 changes)
  4. src/data/roadmaps/ai-engineer/content/chat-completions-api@_bPTciEA1GT1JwfXim19z.md (2 changes)
  5. src/data/roadmaps/ai-engineer/content/development-tools@NYge7PNtfI-y6QWefXJ4d.md (4 changes)
  6. src/data/roadmaps/ai-engineer/content/embedding@grTcbzT7jKk_sIUwOTZTD.md (2 changes)
  7. src/data/roadmaps/ai-engineer/content/embeddings@XyEp6jnBSpCxMGwALnYfT.md (6 changes)
  8. src/data/roadmaps/ai-engineer/content/googles-gemini@oe8E6ZIQWuYvHVbYJHUc1.md (3 changes)
  9. src/data/roadmaps/ai-engineer/content/hugging-face-hub@YLOdOvLXa5Fa7_mmuvKEi.md (2 changes)
  10. src/data/roadmaps/ai-engineer/content/hugging-face-tasks@YKIPOiSj_FNtg0h8uaSMq.md (2 changes)
  11. src/data/roadmaps/ai-engineer/content/indexing-embeddings@5TQnO9B4_LTHwqjI7iHB1.md (2 changes)
  12. src/data/roadmaps/ai-engineer/content/inference-sdk@3kRTzlLNBnXdTsAEXVu_M.md (2 changes)
  13. src/data/roadmaps/ai-engineer/content/introduction@_hYN0gEi9BL24nptEtXWU.md (1 change)
  14. src/data/roadmaps/ai-engineer/content/lancedb@rjaCNT3Li45kwu2gXckke.md (5 changes)
  15. src/data/roadmaps/ai-engineer/content/langchain@ebXXEhNRROjbbof-Gym4p.md (4 changes)
  16. src/data/roadmaps/ai-engineer/content/limitations-and-considerations@MXqbQGhNM3xpXlMC2ib_6.md (2 changes)
  17. src/data/roadmaps/ai-engineer/content/llama-index@d0ontCII8KI8wfP-8Y45R.md (2 changes)
  18. src/data/roadmaps/ai-engineer/content/llamaindex-for-multimodal-apps@akQTCKuPRRelj2GORqvsh.md (2 changes)
  19. src/data/roadmaps/ai-engineer/content/llms@wf2BSyUekr1S1q6l8kyq6.md (2 changes)
  20. src/data/roadmaps/ai-engineer/content/manual-implementation@6xaRB34_g0HGt-y1dGYXR.md (2 changes)
  21. src/data/roadmaps/ai-engineer/content/open-ai-embeddings-api@l6priWeJhbdUD5tJ7uHyG.md (4 changes)
  22. src/data/roadmaps/ai-engineer/content/open-vs-closed-source-models@RBwGsq9DngUsl8PrrCbqx.md (4 changes)
  23. src/data/roadmaps/ai-engineer/content/openai-api@zdeuA4GbdBl2DwKgiOA4G.md (6 changes)
  24. src/data/roadmaps/ai-engineer/content/openai-vision-api@CRrqa-dBw1LlOwVbrZhjK.md (2 changes)
  25. src/data/roadmaps/ai-engineer/content/pinecone@_Cf7S1DCvX7p1_3-tP3C3.md (4 changes)
  26. src/data/roadmaps/ai-engineer/content/pre-trained-models@d7fzv_ft12EopsQdmEsel.md (2 changes)
  27. src/data/roadmaps/ai-engineer/content/prompt-engineering@Dc15ayFlzqMF24RqIF_-X.md (4 changes)
  28. src/data/roadmaps/ai-engineer/content/qdrant@DwOAL5mOBgBiw-EQpAzQl.md (2 changes)
  29. src/data/roadmaps/ai-engineer/content/rag--implementation@lVhWhZGR558O-ljHobxIi.md (7 changes)
  30. src/data/roadmaps/ai-engineer/content/roles-and-responsiblities@K9EiuFgPBFgeRxY4wxAmb.md (2 changes)
  31. src/data/roadmaps/ai-engineer/content/semantic-search@eMfcyBxnMY_l_5-8eg6sD.md (4 changes)
  32. src/data/roadmaps/ai-engineer/content/sentence-transformers@ZV_V6sqOnRodgaw4mzokC.md (2 changes)
  33. src/data/roadmaps/ai-engineer/content/speech-to-text@jQX10XKd_QM5wdQweEkVJ.md (6 changes)
  34. src/data/roadmaps/ai-engineer/content/supabase@9kT7EEQsbeD2WDdN9ADx7.md (2 changes)
  35. src/data/roadmaps/ai-engineer/content/token-counting@FjV3oD7G2Ocq5HhUC17iH.md (2 changes)
  36. src/data/roadmaps/ai-engineer/content/training@xostGgoaYkqMO28iN2gx8.md (2 changes)
  37. src/data/roadmaps/ai-engineer/content/using-sdks-directly@WZVW8FQu6LyspSKm1C_sl.md (2 changes)
  38. src/data/roadmaps/ai-engineer/content/vector-database@zZA1FBhf1y4kCoUZ-hM4H.md (2 changes)
  39. src/data/roadmaps/ai-engineer/content/vector-databases@LnQ2AatMWpExUHcZhDIPd.md (2 changes)
  40. src/data/roadmaps/ai-engineer/content/vector-databases@tt9u3oFlsjEMfPyojuqpc.md (7 changes)
  41. src/data/roadmaps/ai-engineer/content/video-understanding@TxaZCtTCTUfwCxAJ2pmND.md (7 changes)
  42. src/data/roadmaps/ai-engineer/content/weaviate@VgUnrZGKVjAAO4n_llq5-.md (4 changes)
  43. src/data/roadmaps/ai-engineer/content/what-are-embeddings@--ig0Ume_BnXb9K2U7HJN.md (5 changes)
  44. src/data/roadmaps/ai-engineer/content/whisper-api@OTBd6cPUayKaAM-fLWdSt.md (2 changes)
  45. src/data/roadmaps/ai-engineer/content/writing-prompts@9-5DYeOnKJq9XvEMWP45A.md (4 changes)

@@ -5,4 +5,4 @@ AI safety and ethics involve establishing guidelines and best practices to ensur
 Learn more from the following resources:
 - [@video@What is AI Ethics?](https://www.youtube.com/watch?v=aGwYtUzMQUk)
-- [@article@Understanding artificial intelligence ethics and safety](https://www.turing.ac.uk/news/publications/understanding-artificial-intelligence-ethics-and-safety)
+- [@article@Understanding Artificial Intelligence Ethics and Safety](https://www.turing.ac.uk/news/publications/understanding-artificial-intelligence-ethics-and-safety)

@@ -4,4 +4,4 @@ Anomaly detection with embeddings works by transforming data, such as text, imag
 Learn more from the following resources:
-- [@article@Anomoly in Embeddings](https://ai.google.dev/gemini-api/tutorials/anomaly_detection)
+- [@article@Anomaly in Embeddings](https://ai.google.dev/gemini-api/tutorials/anomaly_detection)

@@ -5,4 +5,4 @@ A key aspect of the OpenAI models is their context length, which refers to the a
 Learn more from the following resources:
 - [@official@Managing Context](https://platform.openai.com/docs/guides/text-generation/managing-context-for-text-generation)
-- [@official@Capabilities](https://platform.openai.com/docs/guides/text-generation)
+- [@official@Capabilities](https://platform.openai.com/docs/guides/text-generation)

@@ -5,4 +5,4 @@ The OpenAI Chat Completions API is a powerful interface that allows developers t
 Learn more from the following resources:
 - [@official@Create Chat Completions](https://platform.openai.com/docs/api-reference/chat/create)
-- [@article@](https://medium.com/the-ai-archives/getting-started-with-openais-chat-completions-api-in-2024-462aae00bf0a)
+- [@article@Getting Started with Chat Completions API](https://medium.com/the-ai-archives/getting-started-with-openais-chat-completions-api-in-2024-462aae00bf0a)

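To make the Chat Completions hunk above concrete, here is a minimal sketch using the official OpenAI Python SDK; the model name and prompts are placeholder assumptions, and `OPENAI_API_KEY` is assumed to be set in the environment:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat-capable model works
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what a chat completion is in one sentence."},
    ],
)
print(response.choices[0].message.content)
```
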
@@ -2,7 +2,9 @@
 AI has given rise to a collection of AI-powered development tools of many varieties. We have IDEs like Cursor with AI baked in, live context-capturing tools such as Pieces, and a range of browser-based tools like v0, Claude, and more.
 Learn more from the following resources:
 - [@official@v0 Website](https://v0.dev)
+- [@official@Aider - AI Pair Programming in Terminal](https://github.com/Aider-AI/aider)
+- [@official@Replit AI](https://replit.com/ai)
-- [@official@Pieces Website](https://pieces.app)
+- [@official@Pieces](https://pieces.app)

@@ -5,4 +5,4 @@ In Retrieval-Augmented Generation (RAG), embeddings are essential for linking in
 Learn more from the following resources:
 - [@article@Understanding the role of embeddings in RAG LLMs](https://www.aporia.com/learn/understanding-the-role-of-embeddings-in-rag-llms/)
-- [@article@Mastering RAG: How to Select an Embedding Model](https://www.rungalileo.io/blog/mastering-rag-how-to-select-an-embedding-model)
+- [@article@Mastering RAG: How to Select an Embedding Model](https://www.rungalileo.io/blog/mastering-rag-how-to-select-an-embedding-model)

@@ -4,6 +4,6 @@ Embeddings are dense, continuous vector representations of data, such as words,
 Learn more from the following resources:
-- [@article@What are embeddings in machine learning?](https://www.cloudflare.com/en-gb/learning/ai/what-are-embeddings/)
-- [@article@What is embedding?](https://www.ibm.com/topics/embedding)
-- [@video@What are Word Embeddings](https://www.youtube.com/watch?v=wgfSDrqYMJ4)
+- [@article@What are Embeddings in Machine Learning?](https://www.cloudflare.com/en-gb/learning/ai/what-are-embeddings/)
+- [@article@What is Embedding?](https://www.ibm.com/topics/embedding)
+- [@video@What are Word Embeddings](https://www.youtube.com/watch?v=wgfSDrqYMJ4)

@@ -4,5 +4,6 @@ Google Gemini is an advanced AI model by Google DeepMind, designed to integrate
 Learn more from the following resources:
-- [@official@Google Gemini](https://workspace.google.com/solutions/ai/)
+- [@official@Google Gemini](https://gemini.google.com/)
+- [@official@Google's Gemini Documentation](https://workspace.google.com/solutions/ai/)
 - [@video@Welcome to the Gemini era](https://www.youtube.com/watch?v=_fuimO6ErKI)

@@ -4,5 +4,5 @@ The Hugging Face Hub is a comprehensive platform that hosts over 900,000 machine
 Learn more from the following resources:
-- [@official@Documentation](https://huggingface.co/docs/hub/en/index)
+- [@official@Hugging Face Documentation](https://huggingface.co/docs/hub/en/index)
 - [@course@nlp-official](https://huggingface.co/learn/nlp-course/en/chapter4/1)

@@ -1,6 +1,6 @@
 # Hugging Face Tasks
-Hugging Face supports text classification, named entity recognition, question answering, summarization, and translation. It also extends to multimodal tasks that involve both text and images, such as visual question answering (VQA) and image-text matching. Each task is done by various pre-trained models that can be easily accessed and fine-tuned through the Hugging Face library.
+Hugging Face supports text classification, named entity recognition, question answering, summarization, and translation. It also extends to multimodal tasks that involve both text and images, such as visual question answering (VQA) and image-text matching. Each task is handled by various pre-trained models that can be easily accessed and fine-tuned through the Hugging Face library.
 Learn more from the following resources:

@@ -5,4 +5,4 @@ Embeddings are stored in a vector database by first converting data, such as tex
 Learn more from the following resources:
 - [@article@Indexing & Embeddings](https://docs.llamaindex.ai/en/stable/understanding/indexing/indexing/)
-- [@video@Vector Databases simply explained! (Embeddings & Indexes)](https://www.youtube.com/watch?v=dN0lsF2cvm4)
+- [@video@Vector Databases Simply Explained! (Embeddings & Indexes)](https://www.youtube.com/watch?v=dN0lsF2cvm4)

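As a hedged illustration of the indexing flow described in the hunk above, here is a minimal LlamaIndex sketch; it assumes `llama-index` is installed, `OPENAI_API_KEY` is set for the default embedding model, and a local `data/` folder with documents exists:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load documents, chunk them, embed each chunk, and build an in-memory vector index
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query: the question is embedded and matched against the stored chunk embeddings
query_engine = index.as_query_engine()
print(query_engine.query("What topics do these documents cover?"))
```
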
@@ -1,6 +1,6 @@
 # Inference SDK
-The Hugging Face Inference SDK is a powerful tool that allows developers to easily integrate and run inference on large language models hosted on the Hugging Face Hub. By using the `InferenceClient`, users can make API calls to various models for tasks such as text generation, image creation, and more. The SDK supports both synchronous and asynchronous operations thus compatible with existing workflows.
+The Hugging Face Inference SDK is a powerful tool that allows developers to easily integrate and run inference on large language models hosted on the Hugging Face Hub. By using the `InferenceClient`, users can make API calls to various models for tasks such as text generation, image creation, and more. The SDK supports both synchronous and asynchronous operations, making it compatible with existing workflows.
 Learn more from the following resources:

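A small sketch of the `InferenceClient` usage the paragraph above describes; the model ID is an illustrative assumption, and a Hugging Face token may be required depending on the model:

```python
from huggingface_hub import InferenceClient

client = InferenceClient()  # optionally InferenceClient(token="hf_...")

# Text generation against a hosted model (model ID is just an example)
output = client.text_generation(
    "Write one sentence about vector databases.",
    model="mistralai/Mistral-7B-Instruct-v0.2",
    max_new_tokens=60,
)
print(output)
```
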
@@ -5,3 +5,4 @@ AI Engineering is the process of designing and implementing AI systems using pre
 Learn more from the following resources:
 - [@video@AI vs Machine Learning](https://www.youtube.com/watch?v=4RixMPF4xis)
+- [@article@AI Engineering](https://en.wikipedia.org/wiki/Artificial_intelligence_engineering)

@@ -4,5 +4,6 @@ LanceDB is a vector database designed for efficient storage, retrieval, and mana
 Learn more from the following resources:
-- [@official@LanceDB Website](https://lancedb.com/)
-- [@opensource@LanceDB on GitHub](https://github.com/lancedb/lancedb)
+- [@official@LanceDB](https://lancedb.com/)
+- [@official@LanceDB Documentation](https://docs.lancedb.com/enterprise/introduction)
+- [@opensource@LanceDB on GitHub](https://github.com/lancedb/lancedb)

@@ -4,5 +4,5 @@ LangChain is a development framework that simplifies building applications power
 Learn more from the following resources:
-- [@official@LangChain Website](https://www.langchain.com/)
-- [@video@What is LangChain?](https://www.youtube.com/watch?v=1bUy-1hGZpI)
+- [@official@LangChain](https://www.langchain.com/)
+- [@video@What is LangChain?](https://www.youtube.com/watch?v=1bUy-1hGZpI)

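For illustration, a minimal LangChain pipeline composed with the LangChain Expression Language; the package and model names (`langchain-openai`, `gpt-4o-mini`) are assumptions rather than requirements:

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Prompt -> model -> parser, composed with the | operator
prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

print(chain.invoke({"text": "LangChain chains prompts, models, and output parsers into reusable pipelines."}))
```
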
@@ -4,5 +4,5 @@ Pre-trained models, while powerful, come with several limitations and considerat
 Learn more from the following resources:
-- [@article@Pretrained Topic Models: Advantages and Limitation](https://www.kaggle.com/code/amalsalilan/pretrained-topic-models-advantages-and-limitation)
+- [@article@Pre-trained Topic Models: Advantages and Limitation](https://www.kaggle.com/code/amalsalilan/pretrained-topic-models-advantages-and-limitation)
 - [@video@Should You Use Open Source Large Language Models?](https://www.youtube.com/watch?v=y9k-U9AuDeM)

@@ -4,5 +4,5 @@ LlamaIndex, formerly known as GPT Index, is a tool designed to facilitate the in
 Learn more from the following resources:
-- [@official@llamaindex Website](https://docs.llamaindex.ai/en/stable/)
+- [@official@LlamaIndex](https://docs.llamaindex.ai/en/stable/)
 - [@video@Introduction to LlamaIndex with Python (2024)](https://www.youtube.com/watch?v=cCyYGYyCka4)

@@ -4,5 +4,5 @@ LlamaIndex enables multi-modal apps by linking language models (LLMs) to diverse
 Learn more from the following resources:
-- [@official@LlamaIndex Multy-modal](https://docs.llamaindex.ai/en/stable/use_cases/multimodal/)
+- [@official@LlamaIndex Multi-modal](https://docs.llamaindex.ai/en/stable/use_cases/multimodal/)
 - [@video@Multi-modal Retrieval Augmented Generation with LlamaIndex](https://www.youtube.com/watch?v=35RlrrgYDyU)

@@ -5,5 +5,5 @@ LLMs, or Large Language Models, are advanced AI models trained on vast datasets
 Learn more from the following resources:
 - [@article@What is a large language model (LLM)?](https://www.cloudflare.com/en-gb/learning/ai/what-is-large-language-model/)
-- [@video@How Large Langauge Models Work](https://www.youtube.com/watch?v=5sLYAQS9sWQ)
+- [@video@How Large Language Models Work](https://www.youtube.com/watch?v=5sLYAQS9sWQ)
 - [@video@Large Language Models (LLMs) - Everything You NEED To Know](https://www.youtube.com/watch?v=osKyvYJ3PRM)

@@ -1,6 +1,6 @@
 # Manual Implementation
-Services like [Open AI functions](https://platform.openai.com/docs/guides/function-calling) and Tools or [Vercel's AI SDK](https://sdk.vercel.ai/docs/foundations/tools) make it really easy to make SDK agents however it is a good idea to learn how these tools work under the hood. You can also create fully custom implementation of agents using by implementing custom loop.
+Services like OpenAI functions and tools or Vercel's AI SDK make it really easy to build agents; however, it is a good idea to learn how these tools work under the hood. You can also create a fully custom implementation of agents by implementing a custom loop.
 Learn more from the following resources:

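To show what such a hand-rolled loop can look like, here is a hedged sketch built on OpenAI-style tool calling; the tool, model name, and schema below are illustrative assumptions, not a prescribed implementation:

```python
import json
from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    return f"It is sunny in {city}."  # stand-in for a real tool

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
while True:
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
    message = response.choices[0].message
    if not message.tool_calls:
        print(message.content)  # the model produced a final answer
        break
    messages.append(message)
    for call in message.tool_calls:
        args = json.loads(call.function.arguments)
        result = get_weather(**args)  # a real loop would dispatch by call.function.name
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
```

The loop is the whole trick: call the model, execute any requested tools, feed the results back, and stop when the model answers without requesting a tool.
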
@@ -3,5 +3,5 @@
 The OpenAI Embeddings API allows developers to generate dense vector representations of text, which capture semantic meaning and relationships. These embeddings can be used for various tasks, such as semantic search, recommendation systems, and clustering, by enabling the comparison of text based on similarity in vector space. The API supports easy integration and scalability, making it possible to handle large datasets and perform tasks like finding similar documents, organizing content, or building recommendation engines.
 Learn more from the following resources:
-- [@offical@OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings/create)
-- [@video@Master OpenAI EMBEDDING API](https://www.youtube.com/watch?v=9oCS-VQupoc)
+- [@official@OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings/create)
+- [@video@Master OpenAI Embedding API](https://www.youtube.com/watch?v=9oCS-VQupoc)

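A brief sketch of generating embeddings and comparing them by cosine similarity; the model name is an assumption and the texts are toy examples:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

response = client.embeddings.create(
    model="text-embedding-3-small",  # placeholder embedding model
    input=["How do I reset my password?", "Steps to change your account password"],
)
a, b = (np.array(item.embedding) for item in response.data)

# Cosine similarity: close to 1.0 for semantically similar texts
similarity = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {similarity:.3f}")
```
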
@@ -4,5 +4,5 @@ Open-source models are freely available for customization and collaboration, pro
 Learn more from the following resources:
-- [@article@OpenAI vs. open-source LLM](https://ubiops.com/openai-vs-open-source-llm/)
-- [@video@AI360 | Open-Source vs Closed-Source LLMs](https://www.youtube.com/watch?v=710PDpuLwOc)
+- [@article@OpenAI vs. Open Source LLM](https://ubiops.com/openai-vs-open-source-llm/)
+- [@video@Open-Source vs Closed-Source LLMs](https://www.youtube.com/watch?v=710PDpuLwOc)

@@ -1,3 +1,7 @@
 # OpenAI API
-The OpenAI API provides access to powerful AI models like GPT, Codex, DALL-E, and Whisper, enabling developers to integrate capabilities such as text generation, code assistance, image creation, and speech recognition into their applications via a simple, scalable interface.
+The OpenAI API provides access to powerful AI models like GPT, Codex, DALL-E, and Whisper, enabling developers to integrate capabilities such as text generation, code assistance, image creation, and speech recognition into their applications via a simple, scalable interface.
+Learn more from the following resources:
+- [@official@OpenAI API](https://openai.com/api/)

@@ -5,4 +5,4 @@ The OpenAI Vision API enables models to analyze and understand images, allowing
 Learn more from the following resources:
 - [@official@Vision](https://platform.openai.com/docs/guides/vision)
-- [@video@OpenAI Vision API Crash Course](https://www.youtube.com/watch?v=ZjkS11DSeEk)
+- [@video@OpenAI Vision API Crash Course](https://www.youtube.com/watch?v=ZjkS11DSeEk)

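A hedged sketch of sending an image to the Vision API via the Chat Completions endpoint; the model name and image URL are placeholders:

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is in this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```
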
@@ -4,6 +4,6 @@ Pinecone is a managed vector database designed for efficient similarity search a
 Learn more from the following resources:
-- [@official@Pinecone Website](https://www.pinecone.io)
+- [@official@Pinecone](https://www.pinecone.io)
 - [@article@Everything you need to know about Pinecone](https://www.packtpub.com/article-hub/everything-you-need-to-know-about-pinecone-a-vector-database?srsltid=AfmBOorXsy9WImpULoLjd-42ERvTzj3pQb7C2EFgamWlRobyGJVZKKdz)
-- [@video@Introducing Pinecone Serverless](https://www.youtube.com/watch?v=iCuR6ihHQgc)
+- [@video@Introducing Pinecone Serverless](https://www.youtube.com/watch?v=iCuR6ihHQgc)

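An illustrative upsert-and-query sketch with the Pinecone Python client; the index name, API key, and 3-dimensional toy vectors are assumptions (real vectors must match the index's configured dimension):

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # assumes this index already exists

# Store a vector with metadata, then query for its nearest neighbours
index.upsert(vectors=[{"id": "doc-1", "values": [0.1, 0.2, 0.3], "metadata": {"source": "faq"}}])
results = index.query(vector=[0.1, 0.2, 0.3], top_k=3, include_metadata=True)
print(results)
```
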
@@ -4,4 +4,4 @@ Pre-trained models are Machine Learning (ML) models that have been previously tr
 Visit the following resources to learn more:
-- [@article@Pre-trained models: Past, present and future](https://www.sciencedirect.com/science/article/pii/S2666651021000231)
+- [@article@Pre-trained Models: Past, Present and Future](https://www.sciencedirect.com/science/article/pii/S2666651021000231)

@@ -4,5 +4,5 @@ Prompt engineering is the process of crafting effective inputs (prompts) to guid
 Learn more from the following resources:
-- [@roadmap@Prompt Engineering Roadmap](https://roadmap.sh/prompt-engineering)
-- [@video@What is Prompt Engineering?](https://www.youtube.com/watch?v=nf1e-55KKbg)
+- [@roadmap@Visit Dedicated Prompt Engineering Roadmap](https://roadmap.sh/prompt-engineering)
+- [@video@What is Prompt Engineering?](https://www.youtube.com/watch?v=nf1e-55KKbg)

@@ -4,6 +4,6 @@ Qdrant is an open-source vector database designed for efficient similarity searc
 Learn more from the following resources:
-- [@official@Qdrant Website](https://qdrant.tech/)
+- [@official@Qdrant](https://qdrant.tech/)
 - [@opensource@Qdrant on GitHub](https://github.com/qdrant/qdrant)
 - [@video@Getting started with Qdrant](https://www.youtube.com/watch?v=LRcZ9pbGnno)

@@ -1,3 +1,8 @@
 # RAG & Implementation
-Retrieval-Augmented Generation (RAG) combines information retrieval with language generation to produce more accurate, context-aware responses. It uses two components: a retriever, which searches a database to find relevant information, and a generator, which crafts a response based on the retrieved data. Implementing RAG involves using a retrieval model (e.g., embeddings and vector search) alongside a generative language model (like GPT). The process starts by converting a query into embeddings, retrieving relevant documents from a vector database, and feeding them to the language model, which then generates a coherent, informed response. This approach grounds outputs in real-world data, resulting in more reliable and detailed answers.
+Retrieval-Augmented Generation (RAG) combines information retrieval with language generation to produce more accurate, context-aware responses. It uses two components: a retriever, which searches a database to find relevant information, and a generator, which crafts a response based on the retrieved data. Implementing RAG involves using a retrieval model (e.g., embeddings and vector search) alongside a generative language model (like GPT). The process starts by converting a query into embeddings, retrieving relevant documents from a vector database, and feeding them to the language model, which then generates a coherent, informed response. This approach grounds outputs in real-world data, resulting in more reliable and detailed answers.
+Learn more from the following resources:
+- [@article@What is RAG?](https://aws.amazon.com/what-is/retrieval-augmented-generation/)
+- [@video@What is Retrieval-Augmented Generation? IBM](https://www.youtube.com/watch?v=T-D1OfcDW1M)

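The retrieve-then-generate flow described in the hunk above, sketched end to end with an in-memory store; the model names and toy documents are assumptions rather than a canonical implementation:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available 24/7 via live chat.",
    "Standard shipping takes 3-5 business days.",
]

def embed(texts):
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

doc_vectors = embed(documents)

query = "How long do I have to return an item?"
query_vector = embed([query])[0]

# Retrieve: pick the document whose embedding is most similar to the query
scores = doc_vectors @ query_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector)
)
context = documents[int(np.argmax(scores))]

# Generate: answer grounded in the retrieved context
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": f"Answer using only this context: {context}"},
        {"role": "user", "content": query},
    ],
)
print(answer.choices[0].message.content)
```

In production the in-memory array would be replaced by a vector database, but the retrieve-then-generate shape stays the same.
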
@@ -1,4 +1,4 @@
-# Roles and Responsiblities
+# Roles and Responsibilities
 AI Engineers are responsible for designing, developing, and deploying AI systems that solve real-world problems. Their roles include building machine learning models, implementing data processing pipelines, and integrating AI solutions into existing software or platforms. They work on tasks like data collection, cleaning, and labeling, as well as model training, testing, and optimization to ensure high performance and accuracy. AI Engineers also focus on scaling models for production use, monitoring their performance, and troubleshooting issues. Additionally, they collaborate with data scientists, software developers, and other stakeholders to align AI projects with business goals, ensuring that solutions are reliable, efficient, and ethically sound.

@@ -4,5 +4,5 @@ Embeddings are used for semantic search by converting text, such as queries and
 Learn more from the following resources:
-- [@article@What is semantic search?](https://www.elastic.co/what-is/semantic-search)
-- [@video@What is Semantic Search? Cohere](https://www.youtube.com/watch?v=fFt4kR4ntAA)
+- [@article@What is Semantic Search?](https://www.elastic.co/what-is/semantic-search)
+- [@video@What is Semantic Search? - Cohere](https://www.youtube.com/watch?v=fFt4kR4ntAA)

@@ -6,4 +6,4 @@ Learn more from the following resources:
 - [@article@What is BERT?](https://h2o.ai/wiki/bert/)
 - [@article@SentenceTransformers Documentation](https://sbert.net/)
-- [@article@Using Sentence Transformers at Hugging Face](https://huggingface.co/docs/hub/sentence-transformers)
+- [@article@Using Sentence Transformers at Hugging Face](https://huggingface.co/docs/hub/sentence-transformers)

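A small illustrative example with the `sentence-transformers` library; the model name is a common small default used here as an assumption:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose model

sentences = [
    "A cat sits on the mat.",
    "A feline is resting on a rug.",
    "Stock prices fell sharply today.",
]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Similar sentences score noticeably higher than unrelated ones
print(util.cos_sim(embeddings[0], embeddings[1:]))
```
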
@@ -4,6 +4,6 @@ In the context of multimodal AI, speech-to-text technology converts spoken langu
 Learn more from the following resources:
-- [@article@What is speech to text? Amazon](https://aws.amazon.com/what-is/speech-to-text/)
-- [@article@Turn speech into text using Google AI](https://cloud.google.com/speech-to-text)
-- [@article@How is Speech to Text Used? ](https://h2o.ai/wiki/speech-to-text/)
+- [@article@What is Speech to Text?](https://aws.amazon.com/what-is/speech-to-text/)
+- [@article@Turn Speech into Text using Google AI](https://cloud.google.com/speech-to-text)
+- [@article@How is Speech to Text Used?](https://h2o.ai/wiki/speech-to-text/)

@@ -4,5 +4,5 @@ Supabase Vector is an extension of the Supabase platform, specifically designed
 Learn more from the following resources:
-- [@official@Supabase Vector website](https://supabase.com/vector)
+- [@official@Supabase Vector](https://supabase.com/vector)
 - [@video@Supabase Vector: The Postgres Vector database](https://www.youtube.com/watch?v=MDxEXKkxf2Q)

@@ -5,4 +5,4 @@ Token counting refers to tracking the number of tokens processed during interact
 Learn more from the following resources:
 - [@official@OpenAI Tokenizer Tool](https://platform.openai.com/tokenizer)
-- [@article@How to count tokens with Tiktoken](https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken)
+- [@article@How to count tokens with Tiktoken](https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken)

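A quick sketch of counting tokens locally with `tiktoken`; the encoding name is an assumption that applies to many recent OpenAI models:

```python
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # encoding used by many GPT-3.5/GPT-4 era models

text = "Token counting helps you estimate cost and stay within the context window."
tokens = encoding.encode(text)

print(len(tokens), "tokens")
print(tokens[:10])  # the first few token IDs
```
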
@@ -6,4 +6,4 @@ Learn more from the following resources:
 - [@article@What is Model Training?](https://oden.io/glossary/model-training/)
 - [@article@Machine learning model training: What it is and why it’s important](https://domino.ai/blog/what-is-machine-learning-model-training)
-- [@article@Training ML Models - Amazon](https://docs.aws.amazon.com/machine-learning/latest/dg/training-ml-models.html)
+- [@article@Training ML Models - Amazon](https://docs.aws.amazon.com/machine-learning/latest/dg/training-ml-models.html)

@@ -1,6 +1,6 @@
 # Using SDKs Directly
-While tools like Langchain and LlamaIndex make it easy to implement RAG, you don't have to necessarily learn and use them. If you know about the different steps of implementing RAG you can simply do it all yourself e.g. do the chunking using @langchain/textsplitters package, create embeddings using any LLM e.g. use OpenAI Embedding API through their SDK, save the embeddings to any vector database e.g. if you are using Supabase Vector DB, you can use their SDK and similarly you can use the relevant SDKs for the rest of the steps as well.
+While tools like LangChain and LlamaIndex make it easy to implement RAG, you don't necessarily have to learn and use them. If you know the different steps of implementing RAG, you can do it all yourself: do the chunking with the `@langchain/textsplitters` package, create embeddings with any provider (e.g. the OpenAI Embeddings API through their SDK), save the embeddings to any vector database (e.g. the Supabase Vector DB SDK if that is what you use), and use the relevant SDKs for the rest of the steps as well.
 Learn more from the following resources:

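As a hedged sketch of wiring those steps together yourself, shown here with Python analogues (`langchain-text-splitters` for chunking, the OpenAI SDK for embeddings, `supabase-py` for storage); the file name, table name, project URL, and keys are placeholders:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
from openai import OpenAI
from supabase import create_client

# 1. Chunk the source text
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
with open("handbook.txt") as f:
    chunks = splitter.split_text(f.read())

# 2. Embed each chunk
openai_client = OpenAI()
embeddings = openai_client.embeddings.create(model="text-embedding-3-small", input=chunks)

# 3. Store chunk + embedding in a pgvector-backed table (name is illustrative)
supabase = create_client("https://YOUR-PROJECT.supabase.co", "YOUR-ANON-KEY")
for chunk, item in zip(chunks, embeddings.data):
    supabase.table("documents").insert({"content": chunk, "embedding": item.embedding}).execute()
```
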
@@ -5,4 +5,4 @@ When implementing Retrieval-Augmented Generation (RAG), a vector database is use
 Learn more from the following resources:
 - [@article@How to Implement Graph RAG Using Knowledge Graphs and Vector Databases](https://towardsdatascience.com/how-to-implement-graph-rag-using-knowledge-graphs-and-vector-databases-60bb69a22759)
-- [@article@Retrieval Augmented Generation (RAG) with vector databases: Expanding AI Capabilities](https://objectbox.io/retrieval-augmented-generation-rag-with-vector-databases-expanding-ai-capabilities/)
+- [@article@Retrieval Augmented Generation (RAG) with vector databases: Expanding AI Capabilities](https://objectbox.io/retrieval-augmented-generation-rag-with-vector-databases-expanding-ai-capabilities/)

@@ -5,4 +5,4 @@ Vector databases are specialized systems designed to store, index, and retrieve
 Learn more from the following resources:
 - [@article@Vector Databases](https://developers.cloudflare.com/vectorize/reference/what-is-a-vector-database/)
-- [@article@What are Vector Databases?](https://www.mongodb.com/resources/basics/databases/vector-databases)
+- [@article@What are Vector Databases?](https://www.mongodb.com/resources/basics/databases/vector-databases)

@@ -1,3 +1,8 @@
 # Vector Databases
-Vector databases are systems specialized in storing, indexing, and retrieving high-dimensional vectors, often used as embeddings for data like text, images, or audio. Unlike traditional databases, they excel at managing unstructured data by enabling fast similarity searches, where vectors are compared to find the closest matches. This makes them essential for tasks like semantic search, recommendation systems, and content discovery. Using techniques like approximate nearest neighbor (ANN) search, vector databases handle large datasets efficiently, ensuring quick and accurate retrieval even at scale.
+Vector databases are systems specialized in storing, indexing, and retrieving high-dimensional vectors, often used as embeddings for data like text, images, or audio. Unlike traditional databases, they excel at managing unstructured data by enabling fast similarity searches, where vectors are compared to find the closest matches. This makes them essential for tasks like semantic search, recommendation systems, and content discovery. Using techniques like approximate nearest neighbor (ANN) search, vector databases handle large datasets efficiently, ensuring quick and accurate retrieval even at scale.
+Learn more from the following resources:
+- [@article@Vector Databases](https://developers.cloudflare.com/vectorize/reference/what-is-a-vector-database/)
+- [@article@What are Vector Databases?](https://www.mongodb.com/resources/basics/databases/vector-databases)

@@ -1,3 +1,8 @@
 # Video Understanding
-Video understanding with multimodal AI involves analyzing and interpreting both visual and audio content to provide a more comprehensive understanding of videos. Common use cases include video summarization, where AI extracts key scenes and generates summaries; content moderation, where the system detects inappropriate visuals or audio; and video indexing for easier search and retrieval of specific moments within a video. Other applications include enhancing video-based recommendations, security surveillance, and interactive entertainment, where video and audio are processed together for real-time user interaction.
+Video understanding with multimodal AI involves analyzing and interpreting both visual and audio content to provide a more comprehensive understanding of videos. Common use cases include video summarization, where AI extracts key scenes and generates summaries; content moderation, where the system detects inappropriate visuals or audio; and video indexing for easier search and retrieval of specific moments within a video. Other applications include enhancing video-based recommendations, security surveillance, and interactive entertainment, where video and audio are processed together for real-time user interaction.
+Learn more from the following resources:
+- [@article@Video Understanding](https://dl.acm.org/doi/10.1145/3503161.3551600)
+- [@opensource@Awesome LLM for Video Understanding](https://github.com/yunlong10/Awesome-LLMs-for-Video-Understanding)

@@ -4,5 +4,5 @@ Weaviate is an open-source vector database that allows users to store, search, a
 Learn more from the following resources:
-- [@official@Weaviate Website](https://weaviate.io/)
-- [@video@Advanced AI Agents with RAG](https://www.youtube.com/watch?v=UoowC-hsaf0&list=PLTL2JUbrY6tVmVxY12e6vRDmY-maAXzR1)
+- [@official@Weaviate](https://weaviate.io/)
+- [@video@Advanced AI Agents with RAG](https://www.youtube.com/watch?v=UoowC-hsaf0&list=PLTL2JUbrY6tVmVxY12e6vRDmY-maAXzR1)

@@ -1,3 +1,8 @@
 # What are Embeddings
 Embeddings are dense, numerical vector representations of data, such as words, sentences, images, or audio, that capture their semantic meaning and relationships. By converting data into fixed-length vectors, embeddings allow machine learning models to process and understand the data more effectively. For example, word embeddings represent similar words with similar vectors, enabling tasks like semantic search, recommendation systems, and clustering. Embeddings make it easier to compare, search, and analyze complex, unstructured data by mapping similar items close together in a high-dimensional space.
+Visit the following resources to learn more:
+- [@official@Introducing Text and Code Embeddings](https://openai.com/index/introducing-text-and-code-embeddings/)
+- [@article@What are Embeddings](https://www.cloudflare.com/learning/ai/what-are-embeddings/)

@@ -5,4 +5,4 @@ The Whisper API by OpenAI enables developers to integrate speech-to-text capabil
 Learn more from the following resources:
 - [@official@OpenAI Whisper](https://openai.com/index/whisper/)
-- [@opensource@Whisper on GitHub](https://github.com/openai/whisper)
+- [@opensource@Whisper on GitHub](https://github.com/openai/whisper)

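A minimal transcription sketch with the OpenAI Python SDK; the audio file name is a placeholder:

```python
from openai import OpenAI

client = OpenAI()

# Send an audio file to the Whisper endpoint and print the transcript
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio_file)

print(transcript.text)
```
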
@@ -4,6 +4,6 @@ Prompts for the OpenAI API are carefully crafted inputs designed to guide the la
 Learn more from the following resources:
-- [@roadmap@](https://roadmap.sh/prompt-engineering)
+- [@roadmap@Visit Dedicated Prompt Engineering Roadmap](https://roadmap.sh/prompt-engineering)
 - [@article@How to write AI prompts](https://www.descript.com/blog/article/how-to-write-ai-prompts)
-- [@article@Prompt Engineering Guide](https://www.promptingguide.ai/)
+- [@article@Prompt Engineering Guide](https://www.promptingguide.ai/)
