How to Integrate OpenAI With Azure Cognitive Search (Vector Search)

 

Introduction

In this article, I’ll explain how you can use Azure Cognitive Search with OpenAI, but before that, let’s understand the role of each of these major components.

Azure Cognitive Search

Azure Cognitive Search is a search service from Azure in which we can store our data as high-dimensional vectors (embeddings), each with a fixed number of dimensions. The service can perform both text and vector searches, but in this article, I will focus only on the vector search capability.

OpenAI

OpenAI provides APIs that can be used to create interactive chatbots and virtual assistants that can carry out conversations in a natural manner. It works with prompts of various sizes based on the selected AI model.

OpenAI charges by the number of tokens processed, so we have to be careful about how much data we send to the OpenAI API. We cannot pass huge text files in a single shot just to get the answer to one question. Instead, it is better to pass only the relevant content to the API, and to find that relevant content, we are using Azure Cognitive Search.

With this much background, let’s move on to the key steps and understand how these two systems can talk to each other.

Create An Instance Of Cognitive Search

The very first thing we need to do is to create an instance of Cognitive Search on the Azure portal, and that can be done by searching for Cognitive Search in the search bar and clicking on the Create button. Make sure that you have logged on to the Azure portal with an active subscription in order to do this step.


Follow the guided dialog to create your instance. Once the instance is created, make a note of its endpoint (the URL shown on the Overview page) and an admin key (listed under Keys).

Get An OpenAI API Key

To get the OpenAI key, go to https://openai.com/, log in, and create a key on the API keys page of your account.
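With both keys in hand, one option is to keep them out of the source code and read them from environment variables. The sketch below is only an assumption about how the placeholders used later in this article (AZURE_SEARCH_ENDPOINT and YOUR_SEARCH_CREDENTIAL) could be populated. The environment variable names and the `load_settings` helper are hypothetical.

```python
import os

# Hypothetical environment variable names; choose whatever fits your setup.
REQUIRED_SETTINGS = ("AZURE_SEARCH_ENDPOINT", "AZURE_SEARCH_KEY", "OPENAI_API_KEY")


def load_settings(env=os.environ):
    """Return the required settings as a dict, failing early if any is missing."""
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SETTINGS}
```

The later snippets could then build their inputs from this dict, for example `AzureKeyCredential(settings["AZURE_SEARCH_KEY"])` for the search credential and `openai.api_key = settings["OPENAI_API_KEY"]` for OpenAI.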

Import Required Packages

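Install the dependent libraries (the openai and azure-search-documents packages) and import the packages below. Note that class names such as Vector and HnswVectorSearchAlgorithmConfiguration come from a preview release of azure-search-documents and may differ in other releases, so pin the version you test with.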

import json  # used later to load the documents for upload
import os

import openai
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.models import Vector
from azure.search.documents.indexes.models import (
    ComplexField,
    CorsOptions,
    SearchIndex,
    ScoringProfile,
    SimpleField,
    SearchField,
    SearchFieldDataType,  # used later in the index field definitions
    SearchableField,
    VectorSearch,
    HnswVectorSearchAlgorithmConfiguration
)

Create Vector Configuration

Next, we need to define the configuration for our vector search. Here we are using HNSW (Hierarchical Navigable Small World), an algorithm for approximate nearest neighbor search.

If you want to read more about the parameters and their usage, feel free to refer to the References section of this article.

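vector_search = VectorSearch(
    algorithm_configurations=[
        HnswVectorSearchAlgorithmConfiguration(
            name="my-vector-config",  # referenced later by the vector field
            kind="hnsw",
            parameters={
                "m": 4,                 # links created per node in the graph
                "efConstruction": 400,  # candidate list size while building the index
                "efSearch": 500,        # candidate list size while searching
                "metric": "cosine"      # similarity measure between vectors
            }
        )
    ]
)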
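To build some intuition for the "cosine" metric above: a vector search returns the stored vectors whose direction is closest to that of the query vector. HNSW finds those neighbors approximately, by walking a graph, instead of comparing the query against every stored vector. The brute-force sketch below is not what the service runs. It only illustrates the ranking that HNSW approximates, using made-up three-dimensional vectors and hypothetical function names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, stored, k=2):
    """Exact (brute-force) top-k search: score every stored vector, keep the best k."""
    ranked = sorted(
        stored,
        key=lambda doc_id: cosine_similarity(query, stored[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy data: hypothetical 3-dimensional "embeddings" keyed by document ID.
stored = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
print(nearest([1.0, 0.05, 0.0], stored, k=2))
```

The brute-force version costs one comparison per stored vector for every query, which is why an approximate index becomes attractive as the number of documents grows.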

Create Search Index Client

This is the most important part of our implementation because this is the place where we define the fields for our database.

To keep it simple, I’m taking just three fields: documentId is the key and stores a randomly generated ID, content stores the actual text content, and embedding stores the vector against which we are going to run our query.

# AZURE_SEARCH_ENDPOINT is the service URL; YOUR_SEARCH_CREDENTIAL is an
# AzureKeyCredential built from the key noted earlier.
client = SearchIndexClient(AZURE_SEARCH_ENDPOINT, YOUR_SEARCH_CREDENTIAL)

# Create the index (placeholder name; index names must be lowercase)
index_name = "index-name"
fields = [
    SimpleField(name="documentId", type=SearchFieldDataType.String, filterable=True, sortable=True, key=True),
    SearchableField(name="content", type=SearchFieldDataType.String),
    SearchField(
        name="embedding",
        type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
        searchable=True,
        vector_search_dimensions=1536,  # must match the embedding model's output size
        vector_search_configuration="my-vector-config",  # name defined in vector_search
    ),
]
index = SearchIndex(
    name=index_name,
    fields=fields,
    vector_search=vector_search,
)
result = client.create_index(index)

Upload Documents To Index

Next, we need to upload the embeddings of our input documents to the vector database. If your document is too large, it is recommended to split it into multiple smaller chunks. In my case, I have already split my document (an example document taken from the Azure GitHub), and this is what the chunks look like.

[Document(page_content='Contoso Electronics Employee Handbook\nThis document contains information generated using a language model (Azure OpenAI). The \ninformation contained in this document is only for demonstration purposes and does not \nreflect the opinions or beliefs of Microsoft. Microsoft makes no representations or \nwarranties of any kind, express or implied, about the completeness, accuracy, reliability, \nsuitability or availability with respect to the information contained in this document. \nAll rights reserved to Microsoft\nContoso Electronics Employee Handbook\nLast Updated: 2023-03-05', metadata={'source': '.\\Docs\\Handbook.txt'}),
 Document(page_content="Contoso Electronics is a leader in the aerospace industry, providing advanced electronic \ncomponents for both commercial and military aircraft. We specialize in creating cutting\x02edge systems that are both reliable and efficient. Our mission is to provide the highest \nquality aircraft components to our customers, while maintaining a commitment to safety \nand excellence. We are proud to have built a strong reputation in the aerospace industry \nand strive to continually improve our products and services. Our experienced team of \nengineers and technicians are dedicated to providing the best products and services to our \ncustomers. With our commitment to excellence, we are sure to remain a leader in the \naerospace industry for years to come.\nOur Mission\nContoso Electronics is a leader in the aerospace industry, providing advanced electronic \ncomponents for both commercial and military aircraft. We specialize in creating cutting\x02edge systems that are both reliable and efficient. Our mission is to provide the highest \nquality aircraft components to our customers, while maintaining a commitment to safety \nand excellence. We are proud to have built a strong reputation in the aerospace industry \nand strive to continually improve our products and services. Our experienced team of \nengineers and technicians are dedicated to providing the best products and services to our \ncustomers. With our commitment to excellence, we are sure to remain a leader in the \naerospace industry for years to come.\nValues\nAt Contoso Electronics, we strive to create an environment that values hard work, \ninnovation, and collaboration. Our core values serve as the foundation for our success, and \nthey guide our employees in how we should act and interact with each other and our \ncustomers.\nCompany Values:\n1. Quality: We strive to provide the highest quality products and services to our customers.\n2. 
Integrity: We value honesty, respect, and trustworthiness in all our interactions.\n3. Innovation: We encourage creativity and support new ideas and approaches to our \nbusiness.\n4. Teamwork: We believe that by working together, we can achieve greater success.\n5. Respect: We treat all our employees, customers, and partners with respect and dignity.\n6. Excellence: We strive to exceed expectations and provide excellent service.\n7. Accountability: We take responsibility for our actions and hold ourselves and others \naccountable for their performance.\n8. Community: We are committed to making a positive impact in the communities in which \nwe work and live.\nPerformance Reviews\nPerformance Reviews at Contoso Electronics\nAt Contoso Electronics, we strive to ensure our employees are getting the feedback they \nneed to continue growing and developing in their roles. We understand that performance \nreviews are a key part of this process and it is important to us that they are conducted in an \neffective and efficient manner.\nPerformance reviews are conducted annually and are an important part of your career \ndevelopment. During the review, your supervisor will discuss your performance over the \npast year and provide feedback on areas for improvement. They will also provide you with \nan opportunity to discuss your goals and objectives for the upcoming year.\nPerformance reviews are a two-way dialogue between managers and employees. We \nencourage all employees to be honest and open during the review process, as it is an \nimportant opportunity to discuss successes and challenges in the workplace.\nWe aim to provide positive and constructive feedback during performance reviews. This \nfeedback should be used as an opportunity to help employees develop and grow in their \nroles.\nEmployees will receive a written summary of their performance review which will be \ndiscussed during the review session. 
This written summary will include a rating of the \nemployee’s performance, feedback, and goals and objectives for the upcoming year.\nWe understand that performance reviews can be a stressful process. We are committed to \nmaking sure that all employees feel supported and empowered during the process. We \nencourage all employees to reach out to their managers with any questions or concerns \nthey may have.\nWe look forward to conducting performance reviews with all our employees. They are an \nimportant part of our commitment to helping our employees grow and develop in their \nroles.\nWorkplace Safety\nWelcome to Contoso Electronics! Our goal is to provide a safe and healthy work \nenvironment for our employees and to maintain a safe workplace that is free from \nrecognized hazards. We believe that workplace safety is everyone's responsibility and we \nare committed to providing a safe working environment for all of our employees. \nContoso Electronics' Workplace Safety Program\nAt Contoso Electronics, we have established a comprehensive workplace safety program \nthat is designed to protect our employees from workplace hazards. 
Our program includes:\n• Hazard Identification and Risk Assessment – We strive to identify and assess potential \nsafety hazards in the workplace and take the necessary steps to reduce or eliminate them.\n• Training – We provide our employees with safety training to ensure that they are aware of \nsafety procedures and protocols.\n• Personal Protective Equipment (PPE) – We provide our employees with the necessary PPE \nto ensure their safety.\n• Emergency Preparedness – We have established procedures and protocols in the event of \nan emergency.\n• Reporting – We encourage our employees to report any safety concerns or incidents to \nour safety department.\n• Inspections – We conduct regular safety inspections to ensure that our workplace is free \nfrom hazards.\n• Record Keeping – We maintain accurate records of all safety incidents, inspections and \ntraining.\nWe believe that our workplace safety program is essential to providing a safe and healthy \nwork environment for our employees. We are committed to providing a safe working \nenvironment and to protecting our employees from workplace hazards. If you have any \nquestions or concerns related to workplace safety, please contact our safety department. \nThank you for being a part of the Contoso Electronics team.\nWorkplace Violence\nWorkplace Violence Prevention Program\nAt Contoso Electronics, we are committed to providing a safe, respectful and healthy \nworkplace for all of our employees. In order to ensure that we maintain this, we have \ndeveloped a comprehensive Workplace Violence Prevention Program.\nPurpose\nThe purpose of this program is to promote a safe and healthy work environment by \npreventing violence, threats, and abuse in the workplace. 
It is also intended to provide a \nsafe, secure and protected environment for our employees, customers, and visitors.\nDefinition of Workplace Violence\nWorkplace violence is any act of physical aggression, intimidation, or threat of physical \nharm toward another individual in the workplace. This includes but is not limited to \nphysical assault, threats of violence, verbal abuse, intimidation, harassment, bullying, \nstalking, and any other behavior that creates a hostile work environment.\nPrevention and Response\nContoso Electronics is committed to preventing workplace violence and will not tolerate \nany acts of violence, threats, or abuse in the workplace. All employees are expected to \nfollow the company’s zero tolerance policy for workplace violence.\nIf an employee believes that they are in danger or are the victim or witness of workplace \nviolence, they should immediately notify their supervisor or Human Resources \nRepresentative. Employees are also encouraged to report any suspicious activity or \nbehavior to their supervisor or Human Resources Representative.\nIn the event of an incident of workplace violence, Contoso Electronics will respond \npromptly and appropriately. All incidents will be thoroughly investigated and the \nappropriate disciplinary action will be taken.\nTraining and Education\nContoso Electronics will provide regular training and education to all employees on \nworkplace violence prevention and response. This training will include information on \nrecognizing potential signs of workplace violence, strategies for responding to incidents, \nand the company’s zero tolerance policy.\nWe are committed to creating a safe and secure work environment for all of our employees. 
\nBy following the guidelines outlined in this program, we can ensure that our workplace is \nfree from violence and abuse.\nPrivacy\nPrivacy Policy\nAt Contoso Electronics, we are committed to protecting the privacy and security of our \ncustomers, employees, and partners. We have developed a comprehensive privacy program \nto ensure that we comply with applicable laws, regulations, and industry standards.\nThis policy applies to all Contoso Electronics employees, contractors, and partners.\nCollection and Use of Personal Information\nContoso Electronics collects, stores, and uses personal information for a variety of purposes, \nsuch as to provide services, process orders, respond to customer inquiries, and to provide \nmarketing communications.\nWe may also collect information from third parties, such as our partners and vendors. We \nmay use this information to better understand our customers and improve our services.\nContoso Electronics will not sell or rent your personal information to any third parties.\nData Security and Protection\nContoso Electronics is committed to protecting the security of your personal information. \nWe have implemented physical, technical, and administrative measures to protect your data \nfrom unauthorized access, alteration, or disclosure.\nWe use secure servers and encryption technology to protect data transmitted over the \nInternet.\nAccess to Personal Information\nYou have the right to access, review, and request a copy of your personal information that \nwe have collected and stored. You may also request that we delete or correct any inaccurate \ninformation.\nTo access or make changes to your personal information, please contact the Privacy Officer \nat [email protected].\nChanges to This Policy\nWe may update this policy from time to time to reflect changes in our practices or \napplicable laws. 
We will notify you of any changes by posting a revised policy on our \nwebsite.\nQuestions or Concerns\nIf you have any questions or concerns about our privacy policies or practices, please contact \nthe Privacy Officer at [email protected].\nWhistleblower Policy\nContoso Electronics Whistleblower Policy\nAt Contoso Electronics, we believe in maintaining a safe and transparent working \nenvironment for all of our team members. To ensure the well-being of the entire \norganization, we have established a Whistleblower Policy. This policy encourages \nemployees to come forth and report any unethical or illegal activities they may witness \nwhile working at Contoso Electronics.\nThis policy applies to all Contoso Electronics employees, contractors, and other third \nparties.\nDefinition:\nA whistleblower is an individual who reports activities that are illegal, unethical, or \notherwise not in accordance with company policy.\nReporting Procedures:\nIf you witness any activity that you believe to be illegal, unethical, or not in accordance with \ncompany policy, it is important that you report it immediately. You can do this by:\n1. Contacting the Human Resources Department.\n2. Emailing the Compliance Officer at [email protected].\n3. Calling the Compliance Hotline at 1-800-555-1212.\nWhen making a report, please provide as much detail as possible. This information should \ninclude:\n1. The time and date of the incident.\n2. Who was involved.\n3. What happened.\n4. Any evidence you may have related to the incident.\nIf you choose to report anonymously, you may do so by calling the Compliance Hotline at 1-\n800-555-1212.\nRetaliation Prohibited:\nRetaliation of any kind is strictly prohibited. 
Any employee who retaliates against a \nwhistleblower will be subject to disciplinary action, up to and including termination.\nConfidentiality:\nThe identity of the whistleblower will be kept confidential to the extent permitted by law.\nInvestigation:\nAll reported incidents will be investigated promptly and thoroughly.\nThank you for taking the time to read our Whistleblower Policy. We value your commitment \nto ethical and responsible behavior and appreciate your efforts to help us maintain a safe \nand transparent working environment.\nData Security\nData Security at Contoso Electronics\nAt Contoso Electronics, data security is of the utmost importance. We understand that the \nsecurity of our customers’ data is paramount and we are committed to protecting it. We \nhave a comprehensive data security program in place to ensure that all customer data is \nkept secure and confidential.\nData Security Policies:\n• All employees must adhere to data security policies and procedures established by \nContoso Electronics.\n• All customer data must be encrypted when stored or transferred.\n• Access to customer data must be restricted to authorized personnel only.\n• All computers, servers, and other digital devices used to store customer data must be \nprotected with up-to-date anti-virus and security software.\n• All passwords used to access customer data must be complex and regularly updated.\n• All customer data must be backed up regularly and stored securely.\n• All customer data must be destroyed securely when no longer needed.\nData Security Training:\nAll employees must complete data security training at the start of employment and annually \nthereafter. This training will cover topics such as data security policies and procedures, \nencryption, access control, password security, and data backup and destruction.\nData Security Audits:\nContoso Electronics will conduct regular audits of our data security program to ensure that \nit is functioning as intended. 
Audits will cover topics such as system security, access control, \nand data protection.\nIf you have any questions or concerns about Contoso Electronics’ data security program, \nplease contact our data security team. We are committed to keeping your data secure and \nwe appreciate your continued trust. Thank you for being a valued customer.", metadata={'source': '.\\Docs\\Handbook.txt'})
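The chunking step itself is not shown in this article. As a minimal stand-in, assuming plain character-based splitting with a small overlap between neighboring chunks (the function name and the default sizes are illustrative only), the idea looks like this:

```python
def chunk_text(text, chunk_size=1000, overlap=100):
    """Split text into pieces of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that text cut at a
    boundary keeps some of its surrounding context.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]
```

Splitting on paragraph or sentence boundaries usually gives cleaner chunks than fixed character counts, so treat this only as the simplest possible baseline.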

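I generated embeddings and created a JSON file from the two chunks above. Because the file is uploaded as-is, each record in it has to carry the index fields defined earlier: documentId, content, and embedding.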
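The sample file itself is not reproduced in this text, so the following sketch shows one way such a JSON could be assembled. It assumes each record carries exactly the three index fields defined earlier (documentId, content, embedding). `build_records`, `save_records`, and the `embed` parameter are hypothetical names, and `embed` stands for any function that returns an embedding for a string.

```python
import json
import uuid


def build_records(chunks, embed):
    """Turn text chunks into records matching the index fields."""
    return [
        {
            "documentId": str(uuid.uuid4()),  # randomly generated key
            "content": chunk,
            "embedding": embed(chunk),
        }
        for chunk in chunks
    ]


def save_records(records, path):
    """Write the records to a JSON file that can later be loaded for upload."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f)
```

For the upload code in this article, the call would be along the lines of `save_records(build_records(chunks, generate_embeddings), "HandbookContent.json")`.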

Here is the code for upload.

# Load the prepared records (documentId, content, embedding) from disk
with open('HandbookContent.json', 'r') as f:
    documents = json.load(f)

# Upload the records to the index created above
search_client = SearchClient(endpoint=AZURE_SEARCH_ENDPOINT, index_name=index_name, credential=YOUR_SEARCH_CREDENTIAL)
result = search_client.upload_documents(documents)
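The upload call returns one result per document, so it is worth checking that every document was accepted, and large document sets are better sent in batches. The helpers below are a sketch: `failed_keys` relies only on the `key` and `succeeded` attributes of each returned result, the default batch size of 1000 is an assumption about the per-request limit, and both function names are hypothetical.

```python
def batches(items, size=1000):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def failed_keys(results):
    """Return the keys of documents the service did not index successfully."""
    return [r.key for r in results if not r.succeeded]
```

A loop such as `for batch in batches(documents): failed = failed_keys(search_client.upload_documents(batch))` would then surface any documents that need a retry.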

Create Prompt

Once all the documents are indexed, we can start working with our questions. The first step is to generate the embedding for the question, because this is what will be compared against the embeddings already present in the vector database.
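The `generate_embeddings` helper used in the next snippet is not defined in this article. A plausible version might look like the sketch below. It assumes the legacy `openai` Python SDK (pre-1.0, the same generation as the `openai.Completion.create` call used later) and the `text-embedding-ada-002` model, whose 1536-dimensional output matches the index definition. The `create_fn` parameter is not part of any SDK; it exists only so the function can be exercised without a network call.

```python
def generate_embeddings(text, create_fn=None, model="text-embedding-ada-002"):
    """Return the embedding vector (a list of floats) for the given text."""
    if create_fn is None:
        import openai  # legacy SDK (openai<1.0); requires openai.api_key to be set
        create_fn = openai.Embedding.create
    response = create_fn(input=[text], model=model)
    return response["data"][0]["embedding"]
```

Whichever model is used here must be the same one that produced the embeddings stored in the index, otherwise the similarity scores are meaningless.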

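query = "What information does this handbook has?"

# Embed the question and ask for the 2 nearest neighbors on the "embedding" field
vector = Vector(value=generate_embeddings(query), k=2, fields="embedding")

results = search_client.search(
    search_text=None,  # pure vector search, no keyword query
    vectors=[vector],
    select=["content"],
)

# Concatenate the content of every hit into one context string
input_text = " "
for result in results:
    input_text = input_text + result['content'] + " "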
In the end, this code collects all the contextual information that is returned by the search.
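Because cost grows with prompt size, it can help to cap how much retrieved text goes into the prompt. The sketch below is a variant of the loop above under stated assumptions: it uses a character budget as a crude proxy for tokens (a tokenizer would give an exact count), and `build_context` and `max_chars` are hypothetical names.

```python
def build_context(results, max_chars=8000):
    """Concatenate the 'content' of search results until the budget is reached."""
    parts = []
    used = 0
    for result in results:
        text = result["content"]
        if used + len(text) > max_chars:
            break  # stop before exceeding the budget; best matches come first
        parts.append(text)
        used += len(text)
    return " ".join(parts)
```

With this helper, `input_text = build_context(results)` would replace the manual concatenation, at the price of dropping lower-ranked chunks when the budget runs out.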

Make A Call To OpenAI API

This step is going to be very simple. It takes context from the previous step and passes it as a prompt to OpenAI, as shown below.

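response = openai.Completion.create(
    model="gpt-3.5-turbo-instruct",
    prompt=f"Answer the question based on given input text. Input:{input_text}. Question: {query}",
    max_tokens=100,  # upper bound on the length of the answer
    temperature=0,   # least random output
)
response  # in a notebook, this displays the object shown below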

Below is the response. If anything is unclear, please watch my video here.

<OpenAIObject text_completion id=cmpl-85cbHJllGIDa12o6GiE5TRoLgGARc at 0x21aa15c8fe0> JSON: {
  "id": "cmpl-85cbHJllGIDa12o6GiE5TRoLgGARc",
  "object": "text_completion",
  "created": 1696350711,
  "model": "gpt-3.5-turbo-instruct",
  "choices": [
    {
      "text": "\n\nThis handbook contains information about Contoso Electronics, including their mission, values, performance reviews, workplace safety, workplace violence prevention, privacy policies, whistleblower policy, and data security. It also includes information about the company's products and services, their reputation in the aerospace industry, and their commitment to excellence.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 2792,
    "completion_tokens": 61,
    "total_tokens": 2853
  }
}
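In practice, only two parts of this response are usually needed: the generated text and the token usage, which drives the cost. Here is a small sketch that uses a shortened copy of the response above as a plain dictionary (the `summarize` name is hypothetical, and the sample text is truncated for brevity):

```python
def summarize(response):
    """Return the answer text and total token count from a completion response."""
    answer = response["choices"][0]["text"].strip()
    return answer, response["usage"]["total_tokens"]


# Shortened copy of the response shown above (answer text truncated).
sample = {
    "choices": [
        {"text": "\n\nThis handbook contains information about Contoso Electronics", "index": 0}
    ],
    "usage": {"prompt_tokens": 2792, "completion_tokens": 61, "total_tokens": 2853},
}
answer, tokens = summarize(sample)
print(answer)
print(tokens)
```

Note that almost all of the tokens in this example come from the prompt, which is why limiting the retrieved context matters more than limiting the answer length.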

