Embeddings Are Not Working – AI Engine

To confirm that embeddings are being used in the context of your chatbot, check the “Queries tab”. You should also see a request made to the ada model, which is used to fetch the data from the vector database.

Make sure you can find the embedding you are expecting by using the Search mode in the embeddings tab. If nothing appears, check that the minimum score setting is low enough; if it is too high, no embedding may be picked up at all.
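To illustrate why a high minimum score can leave the context empty, here is a minimal sketch of threshold filtering. The function name `my_filter_by_min_score`, the shape of the matches, the 0 to 1 score scale, and the sample values are all hypothetical; this is not AI Engine's internal code, only the general idea.

```php
<?php
// Hypothetical illustration: keep only matches whose similarity score
// reaches the minimum score. This is NOT AI Engine's internal code.
function my_filter_by_min_score( array $matches, float $min_score ): array {
    return array_values( array_filter(
        $matches,
        function ( $match ) use ( $min_score ) {
            return $match['score'] >= $min_score;
        }
    ) );
}

// Made-up search results for demonstration purposes.
$matches = [
    [ 'title' => 'Shipping policy', 'score' => 0.42 ],
    [ 'title' => 'Return policy',   'score' => 0.31 ],
];

// With a threshold of 0.75 nothing passes; with 0.35 one match does.
echo count( my_filter_by_min_score( $matches, 0.75 ) ), "\n"; // 0
echo count( my_filter_by_min_score( $matches, 0.35 ) ), "\n"; // 1
```

If Search mode returns nothing, lowering the threshold step by step shows whether the score is the cause or whether the embedding is missing altogether.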

You can use the following filter to verify that the embeddings are being used. It writes to your logs the embedding that is being added to your chatbot context.

add_filter( 'mwai_context_search', 'my_web_search', 10, 3 );

function my_web_search( $context, $query, $options = [] ) {
  // Log whatever context was found for this query.
  if ( !empty( $context ) ) {
    error_log( "Context: " . print_r( $context, true ) );
  } else {
    error_log( "Empty Context." );
  }

  // Return the context unchanged so the chatbot still receives it.
  return $context;
}
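The `error_log()` calls above write to the PHP error log. If you are not sure where that log is, one common option, assuming you rely on WordPress's built-in debug logging, is to enable it in `wp-config.php` so that the messages land in `wp-content/debug.log`:

```php
<?php
// In wp-config.php, above the "That's all, stop editing!" line.
// If these constants are already defined there, edit the existing lines
// instead of adding duplicates.
define( 'WP_DEBUG', true );          // Turn on WordPress debug mode.
define( 'WP_DEBUG_LOG', true );      // Write errors to wp-content/debug.log.
define( 'WP_DEBUG_DISPLAY', false ); // Keep errors off the front end.
```

Remember to turn debug mode off again once you have finished troubleshooting.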

🗒️ Please ensure that there is adequate space within your context for your embeddings. In the settings, you will find the context max tokens parameter, which determines the maximum number of tokens your embeddings can occupy. If this setting is empty, it is treated as 0, which means that the embeddings will occupy 0 tokens and, consequently, won't be used.

Sometimes your query might not be precise enough for a vector to be returned. Be sure to use terms that match what you are looking for. 😊

If you are using Pinecone, you can log in to your account and look inside your index directly to see how many vectors are inserted in each namespace. It should update in real time, so you can compare the count with what AI Engine shows.

🗒️ If you are using the Pinecone free tier services, please note that inactivity of 7 days will result in the termination of your Pinecone Project. To ensure it still exists, please connect to your account.

All the embeddings are stored locally in the wp_mwai_vectors table. You should be able to view them there and perform manual operations if needed. Deleting all embeddings at once directly in the database can be faster than removing them through the AI Engine embeddings table. If you are unsure about what you are doing, it is recommended to create a backup first using a reliable tool like the excellent BlogVault.
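As a read-only starting point, the sketch below counts the rows in the local table so that you can compare the number with your vector database. `my_count_local_vectors()` is a hypothetical helper, not part of AI Engine, and it assumes the table uses your site's standard prefix (`wp_` by default).

```php
<?php
// Sketch: count the rows in AI Engine's local vectors table.
// my_count_local_vectors() is a hypothetical helper, not part of AI Engine.
function my_count_local_vectors( $wpdb ): int {
    // $wpdb->prefix is usually "wp_", which gives "wp_mwai_vectors".
    $table = $wpdb->prefix . 'mwai_vectors';
    return (int) $wpdb->get_var( "SELECT COUNT(*) FROM `{$table}`" );
}

// Inside WordPress (for example in a small custom plugin), $wpdb is global.
if ( isset( $GLOBALS['wpdb'] ) ) {
    error_log( 'Local vectors: ' . my_count_local_vectors( $GLOBALS['wpdb'] ) );
}
```

The helper takes `$wpdb` as a parameter rather than reading the global directly, which keeps it easy to try out on a staging copy of the site first.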

If everything seems to work fine but you are still not getting the answer you want, this might be because of the behavior of the model you are using. Sometimes, contextual data is ignored if the model judges it irrelevant or if the instructions you are using are contradictory or unrelated. For instance, a GPT model will most likely respond that it can't access actual data if your prompt asks for something of that kind, even though you expect it to take this data from the embedding.

Orphan Vectors

If some "N/A" or orphan embeddings are created, it may be due to discrepancies between your database and the Pinecone registry (learn more further down in this documentation). You should be able to delete them manually until they are all gone. If it keeps happening, look inside your PHP logs for any errors. If there are too many of them, please refer to the last part of this documentation. You can also try changing or refreshing your index and namespace.

The Pinecone database contains only vectorized data, which means that by itself, it's essentially a collection of numbers that can't be directly used. This is why the AI Engine also utilizes your database to store the corresponding vectorized values in a textual format. You can think of the AI Engine's database as a translation book for your Pinecone Database. Therefore, it's important to ensure that both databases have the same number of entries.

When there's a discrepancy between the two databases, you might encounter orphaned vectors appearing in the 'Embedding' table. This indicates that a result was returned from a request to Pinecone, but no corresponding textual value (translation) was found in your database. In such cases, you can either add the missing value or remove the orphaned vector to maintain a clean dataset. In some instances, Pinecone may store the textual value in metadata, and the AI Engine will automatically use it to create a valid 'OK' embedding.

To initiate this process, you can use the 'Sync Pull' option, which requests Pinecone for every vector saved, allowing you to either clean them or fill them with the corresponding textual data.
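The discrepancy described in this section comes down to a set difference: IDs that the vector database returns but that have no matching entry in your local table. Here is a minimal sketch of that idea. The function name `my_find_orphan_ids` and the IDs are made up for illustration, and this is not how AI Engine implements its sync.

```php
<?php
// Hypothetical illustration of an "orphan" check: IDs that the vector
// database returns but that have no matching row in the local table.
function my_find_orphan_ids( array $remote_ids, array $local_ids ): array {
    return array_values( array_diff( $remote_ids, $local_ids ) );
}

// Made-up IDs for demonstration purposes.
$pinecone_ids = [ 'a1', 'b2', 'c3', 'd4' ];
$local_ids    = [ 'a1', 'c3' ];

print_r( my_find_orphan_ids( $pinecone_ids, $local_ids ) ); // b2 and d4
```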

cURL / SSL Related Errors

SSL-related errors come from your hosting service. Unfortunately, we can't do anything on our side, except give you a refund, but we’re sure you would prefer it to work 🙌

Usually, this error message indicates an SSL connection error, which typically occurs when your server has an outdated cURL package or SSL protocol. It's also possible that there's a firewall on your server. Your web host provider should have more information about this issue.
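Before contacting your host, it can help to know which cURL and OpenSSL versions PHP is actually running with. The sketch below gathers them using standard PHP functions; `my_ssl_diagnostics()` is a hypothetical helper name, and the output only tells you what is installed, not whether a firewall is involved.

```php
<?php
// Sketch: collect the cURL and OpenSSL versions PHP is running with.
// my_ssl_diagnostics() is a hypothetical helper name.
function my_ssl_diagnostics(): array {
    $info = [ 'php' => PHP_VERSION, 'curl' => null, 'curl_ssl' => null, 'openssl' => null ];

    // curl_version() only exists when the cURL extension is loaded.
    if ( function_exists( 'curl_version' ) ) {
        $curl = curl_version();
        $info['curl']     = $curl['version'];
        $info['curl_ssl'] = $curl['ssl_version'];
    }

    // OPENSSL_VERSION_TEXT only exists when the OpenSSL extension is loaded.
    if ( defined( 'OPENSSL_VERSION_TEXT' ) ) {
        $info['openssl'] = OPENSSL_VERSION_TEXT;
    }

    return $info;
}

print_r( my_ssl_diagnostics() );
```

You can include the resulting values in your message to your web host so they can tell you whether an update is needed.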