Once you’ve performed a similarity search in your vector database, you’ll typically want to iterate through the results and display the most relevant items. In this guide, we’ll demonstrate how to loop through the results and display the similarity score for each item.
Example Using Chroma
Python
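import chromadb
from sentence_transformers import SentenceTransformer

# Create a Chroma client and collection (as shown in the previous guide).
# This snippet assumes `collection` (a Chroma collection) and `embedding_model`
# (a SentenceTransformer) already exist from that guide.

# Perform a query
query_text = "What is the capital of France?"
# encode() returns a NumPy array; convert it to a plain list of vectors
query_embedding = embedding_model.encode([query_text]).tolist()
results = collection.query(
    query_embeddings=query_embedding,
    n_results=5,
)

# query() returns parallel lists with one inner list per query;
# index 0 selects the results for our single query
documents = results["documents"][0]
distances = results["distances"][0]

# Loop through the results and display each distance (lower means more similar)
for document, distance in zip(documents, distances):
    print(f"Document: {document}")
    print(f"Distance: {distance}")
Explanation
- Iterate through Results: collection.query() returns a dictionary of parallel lists (such as "documents" and "distances"), each holding one inner list per query. Because we sent a single query, we take index 0 of each list and iterate over the pairs with zip in a for loop.
- Access Document: results["documents"][0] contains the text of each matching document.
- Access Similarity Score: results["distances"][0] contains the distance between the query and each document. Chroma reports distances rather than similarity scores, so a smaller value means a closer match. If the collection is configured for cosine distance, 1 - distance gives the cosine similarity.
- Display Results: We print each document and its corresponding distance to the console.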
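To make the shape of the results concrete, here is a minimal sketch that turns Chroma-style parallel lists into one record per hit. The sample_results dictionary is made-up data shaped like the return value of collection.query() for a single query, and to_records is a hypothetical helper for illustration, not part of Chroma.

```python
# Made-up data shaped like the dict returned by collection.query() for one query.
sample_results = {
    "ids": [["doc1", "doc2", "doc3"]],
    "documents": [[
        "Paris is the capital of France.",
        "The Eiffel Tower is in Paris.",
        "Berlin is the capital of Germany.",
    ]],
    "distances": [[0.21, 0.35, 0.58]],
}


def to_records(results, query_index=0):
    """Zip the parallel lists for one query into a list of per-hit dicts."""
    return [
        {"id": doc_id, "document": document, "distance": distance}
        for doc_id, document, distance in zip(
            results["ids"][query_index],
            results["documents"][query_index],
            results["distances"][query_index],
        )
    ]


records = to_records(sample_results)
for record in records:
    print(f"{record['id']}: {record['document']} (distance={record['distance']:.2f})")
```

Working with one dictionary per hit tends to make later steps such as filtering or formatting easier to read than indexing into several lists at once.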
Customizing Output
You can customize the output to suit your specific needs. For example, you might want to sort the results by similarity score, format the output in a table, or highlight the most relevant keywords within the documents.
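As one possible customization, the sketch below sorts hits and prints them as a plain-text table. The hits list is made-up sample data, and format_table and its width parameter are hypothetical names chosen for this example. It assumes each hit is a (document, distance) pair where a lower distance means a closer match.

```python
# Made-up (document, distance) pairs; in practice these come from the query results.
hits = [
    ("The Eiffel Tower is in Paris.", 0.35),
    ("Paris is the capital of France.", 0.21),
    ("Berlin is the capital of Germany.", 0.58),
]


def format_table(hits, width=40):
    """Return a plain-text table of hits sorted by ascending distance."""
    ordered = sorted(hits, key=lambda hit: hit[1])
    lines = [f"{'Rank':<5}{'Distance':<10}Document", "-" * (15 + width)]
    for rank, (document, distance) in enumerate(ordered, start=1):
        # Truncate long documents so each row stays on one line
        text = document if len(document) <= width else document[: width - 3] + "..."
        lines.append(f"{rank:<5}{distance:<10.2f}{text}")
    return "\n".join(lines)


table = format_table(hits)
print(table)
```

Vector databases typically return the nearest results first, so the explicit sort matters mostly after merging hits from several queries or when you want a different order, such as descending similarity.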
Additional Considerations
- Thresholding: To filter out weak matches, compare each result's score against a threshold and display only the results that pass. Chroma reports a distance, where lower means more similar, so keep the results at or below your cutoff.
- Pagination: For large result sets, you might want to implement pagination to display results in batches.
- Visualization: Consider using visualization techniques to represent the similarity scores graphically, such as a bar chart or scatter plot.
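The thresholding and pagination ideas above can be sketched as follows. The function names filter_by_distance and paginate, the sample hits, and the 0.5 cutoff are all assumptions made for illustration. The comparison assumes distance-style scores; for a similarity score where higher is better, the comparison would flip.

```python
def filter_by_distance(hits, max_distance):
    """Keep only hits whose distance is at or below max_distance."""
    return [(doc, dist) for doc, dist in hits if dist <= max_distance]


def paginate(items, page_size):
    """Yield successive pages (lists) of at most page_size items."""
    for start in range(0, len(items), page_size):
        yield items[start:start + page_size]


# Made-up (document, distance) pairs, nearest first
hits = [
    ("Document A", 0.10),
    ("Document B", 0.25),
    ("Document C", 0.40),
    ("Document D", 0.55),
    ("Document E", 0.70),
]

# The 0.5 cutoff is arbitrary; a sensible value depends on the model and metric
close_hits = filter_by_distance(hits, max_distance=0.5)

for page_number, page in enumerate(paginate(close_hits, page_size=2), start=1):
    print(f"Page {page_number}")
    for doc, dist in page:
        print(f"  {dist:.2f}  {doc}")
```

Filtering before paginating keeps page sizes consistent, since weak matches are removed before the results are split into batches.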
By following these steps and customizing the output, you can effectively loop through the results of your similarity search and present the information in a meaningful way.