# Kindly GPT with External Data
If you wish to insert custom data into Kindly GPT that is not reachable by default URL scraping (for example intranet pages, authenticated pages, or data in formats you must transform yourself), you can use the external scraping integration API. See **Connect > Kindly GPT external integration**.

> **Note:** If you wish to scrape a normal URL, we recommend using our out-of-the-box URL scraper in Kindly instead. See **Connect > Kindly GPT scrape web content**.

## Overview

The end-to-end flow is:

1. You set up a web server (your external scraper).
2. Kindly sends a pull trigger request to your external scraper.
3. Your external scraper prepares data and uploads it to Kindly when ready.

```mermaid
sequenceDiagram
    participant KindlyServer as Kindly server
    participant ExternalScraper as External scraper (webserver)
    participant DataIngress as Data ingress endpoint
    Note over ExternalScraper: Set up the external scraper webserver
    KindlyServer->>ExternalScraper: Daily call (trigger)
    ExternalScraper->>KindlyServer: Send OK response
    ExternalScraper->>ExternalScraper: Prepare data
    ExternalScraper->>DataIngress: Send data when ready
```

## Registering the external scraper

In your workspace dashboard, go to **Connect > Kindly GPT external integration**.

*Location of external integration*

From there you can add your external scraper URL.

*How to set up external scrape integrations*

Kindly will send a trigger event to your server daily. The details of how to create your server are explained in the following sections.

Each bot market and language has a separate external scraper integration setup. All markets and languages can point to the exact same URL, but each pull trigger covers only one market/language combination at a time. If you have multiple bots, they can use the exact same URL as well; we send bot identifiers with the pull trigger.

## Pull trigger: what Kindly sends to your server

Kindly sends a signed POST request to your external scraper URL:

```bash
curl -X POST \
  -H "Kindly-HMAC: {{HMAC of the body}}" \
  -H "Kindly-HMAC-Algorithm: HMAC-SHA-256 (base64 encoded)" \
  -H "Kindly-Bot-Id: {{your bot id}}" \
  -H "Kindly-Config-Id: {{external scraping url id}}" \
  {{url to your scraper}} \
  -d @<(cat <<EOF
{
  "bot_id": "{{your bot id}}",
  "config_id": "{{external scraping url id}}"
}
EOF
)
```

This JSON body must be sent back exactly as received when you upload the scraped content. You need to implement HMAC validation on your end.

## HMAC & Kindly's implementation of HMAC

If you have never worked with HMAC, we suggest you seek some external documentation on it. We suggest the following:

- Okta has a good explanation of what HMAC is and how to use it: https://www.okta.com/identity-101/hmac/
- Wikipedia has very good technical details: https://en.wikipedia.org/wiki/HMAC

Validate all signed requests using HMAC-SHA-256 over the exact raw body bytes. In Kindly we use HMAC-SHA-256 (base64 encoded).

Rules:

- Do not normalize or reformat the body before verification.
- Use your workspace's webhook signing key.
- Base64-encode the resulting digest.
- Compare the result to the `Kindly-HMAC` header.

To find your bot's HMAC key, go to the workspace dashboard, then **Settings > General > Security > Show key**.

*How to get the HMAC key of your bot*

## Responding to the pull trigger

When you receive the pull trigger, you should respond with a generic HTTP status 200 response. Do not include scraped content in the response to the pull trigger.

## Preparing data for upload

Create a zip with content files. It should have a collection of files at the root level. Files can only be raw Markdown (`.md`), raw text (`.txt`), or HTML. All other file types will be rejected.

## Upload contract

Send a `multipart/form-data` POST request to `datastore.kindly.ai`.

Form parts:

- `file`: zip archive
- `json`: JSON payload

The maximum request size is 5 MB (5,242,880 bytes).

### Upload JSON body

Send back exactly what you received in the pull trigger:

```json
{ body you received from pull trigger }
```

The JSON body must match the body you received from the pull trigger.

### Upload JSON body (with sources)

Optionally, you can add a custom field to the body, `filename_to_url`, which allows linking to sources. This is a mapping from filenames to URLs so our AI agent can show the references whenever Kindly GPT is used. If you don't provide this mapping, it will still use Kindly GPT but won't show the references.

Here is an example body with references:

```json
{
  "bot_id": "{{your bot id}}",
  "config_id": "{{external scraping url id}}",
  "filename_to_url": {
    "file1.html": "https://not-open-to-public-but-bot-users-has-access-to/file1.html",
    "file2.html": "https://same-with-file1/helpdesk.html"
  }
}
```

### Upload headers

- `Kindly-HMAC`: HMAC over the full raw multipart request body.
- `Kindly-Bot-Id`: bot id (recommended and still supported).
- `Content-Length`: required and must be a valid numeric value. Kindly accepts uploads up to 5 MB (5,242,880 bytes). Set this header to the exact request body size you send.

## Troubleshooting

- **403 HMAC does not match** — common causes: the request body changed after signing, the JSON/body was reformatted before signing, or the wrong bot signing key was used.
- **403 config id does not belong to this bot** — your upload `bot_id` and `config_id` do not match Kindly's stored configuration.
- **400 unsupported file types detected** — your zip contains file extensions outside `.md`, `.txt`, `.html`.
- **411 Content-Length header is required and must be valid** — include a valid numeric `Content-Length` header.

## Advanced: push mode

Push mode is supported. You can upload without waiting for a pull trigger, but you must still provide a valid identity pairing.
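The HMAC rules above (sign the exact raw body bytes, base64-encode the digest, compare against `Kindly-HMAC`) can be sketched in Python using only the standard library. This is a minimal sketch, not Kindly's reference implementation; the function names are illustrative, and `signing_key` stands for the workspace key from **Settings > General > Security > Show key**.

```python
import base64
import hashlib
import hmac


def sign_body(raw_body: bytes, signing_key: str) -> str:
    """Compute the base64-encoded HMAC-SHA-256 digest of the raw body bytes."""
    digest = hmac.new(signing_key.encode("utf-8"), raw_body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_pull_trigger(raw_body: bytes, received_hmac: str, signing_key: str) -> bool:
    """Verify a Kindly-HMAC header value against the exact raw body bytes.

    The body must not be parsed and re-serialized before verification:
    any reformatting changes the bytes and the signature no longer matches.
    Uses a constant-time comparison to avoid timing side channels.
    """
    expected = sign_body(raw_body, signing_key)
    return hmac.compare_digest(expected, received_hmac)
```

Your pull-trigger handler only needs to run this check, store the raw body for later echo-back, and return a plain 200 response; the actual scraping and upload should happen asynchronously afterwards.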
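The "Preparing data for upload" step can be sketched as a small helper that builds the zip in memory and rejects unsupported extensions up front, mirroring the server-side `400 unsupported file types detected` check. The helper name and the `files` dict shape are illustrative assumptions, not part of Kindly's API.

```python
import io
import zipfile
from pathlib import Path

# Only these extensions are accepted by Kindly; anything else is rejected.
ALLOWED_EXTENSIONS = {".md", ".txt", ".html"}


def build_content_zip(files: dict[str, str]) -> bytes:
    """Build an in-memory zip of content files placed at the root level.

    `files` maps a root-level filename to its text content. Filenames with
    extensions other than .md, .txt, or .html raise ValueError locally,
    rather than failing later with a 400 from the upload endpoint.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, content in files.items():
            if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
                raise ValueError(f"unsupported file type: {name}")
            zf.writestr(name, content)
    return buf.getvalue()
```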
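The upload contract (two form parts, `Kindly-HMAC` over the full raw multipart body, exact `Content-Length`, 5 MB cap) can be sketched by assembling the multipart body by hand so the signed bytes are exactly the bytes sent. This is a sketch under the assumptions stated in the contract above; the function name and fixed boundary are illustrative, and actually POSTing the result to `datastore.kindly.ai` with your HTTP client of choice is left out.

```python
import base64
import hashlib
import hmac
import json

MAX_UPLOAD_BYTES = 5_242_880  # Kindly accepts uploads up to 5 MB.


def build_upload_request(zip_bytes: bytes, trigger_body: dict, signing_key: str,
                         boundary: str = "kindly-upload-boundary"):
    """Build the raw multipart/form-data body and headers for the upload.

    The `json` part echoes the pull-trigger body; note that Kindly compares
    it to what it sent, so in practice you should echo the original raw JSON
    string rather than re-serializing a parsed dict. Kindly-HMAC is computed
    over the full raw multipart body, and Content-Length is its exact size.
    """
    json_part = json.dumps(trigger_body).encode("utf-8")
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="json"\r\n',
        b"Content-Type: application/json\r\n\r\n",
        json_part, b"\r\n",
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="file"; filename="content.zip"\r\n',
        b"Content-Type: application/zip\r\n\r\n",
        zip_bytes, b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ])
    if len(body) > MAX_UPLOAD_BYTES:
        raise ValueError("request body exceeds the 5 MB limit")
    digest = hmac.new(signing_key.encode("utf-8"), body, hashlib.sha256).digest()
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Content-Length": str(len(body)),  # exact size of the body being sent
        "Kindly-HMAC": base64.b64encode(digest).decode("ascii"),
        "Kindly-Bot-Id": str(trigger_body["bot_id"]),
    }
    return body, headers
```

Building the body before signing (instead of letting an HTTP library stream it) is what guarantees the HMAC covers the exact bytes on the wire, which is the most common fix for `403 HMAC does not match`.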


