extract_similarity_dataset
The job extracts pairs of highlights from each Video or Sound file; these pairs can be used to calibrate an archive's similarity parameters.
It is useful when you do not want to build your similarity calibration dataset manually and your data consists of videos or sounds, since two highlights can then be sampled from each file's timeline.
Required Account Privileges: "read"
Request JSON ["inputs"]:
"source_content_type": string in ["Video", "Sound"]
null NOT allowed
A string specifying the type of source content.

"content_type": string in ["Video", "Image", "Sound"]
null NOT allowed
A string specifying the type of output content.

"max_nr_of_pairs": int (>=1)
null NOT allowed
The total number of pairs to export from all the files. If the number is smaller than the total nr of files then the most diverse pairs will be prioritized.

"custom_vectorizer_name": string (3 <= len <= 30)
null allowed
An optional string representing the name of the custom vectorizer to be used for sampling.

"file_urls": list of strings
null allowed
An optional list of strings containing the URLs of files to be downloaded.

"download_from_batch_cloud_folder": bool
null NOT allowed
A boolean indicating whether to download files from the batch cloud folder.
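The constraints listed for the request inputs can be checked on the client side before the job is submitted. The following is a minimal sketch in Python, not part of the API itself: the helper name build_inputs is hypothetical, the example URL is a placeholder, and the final check (at least one file source must be given) is an assumption based on the File Requirements section. The rest of the request envelope, such as how the job name and credentials are passed, is not shown here.

```python
import json
from typing import List, Optional

SOURCE_TYPES = {"Video", "Sound"}
OUTPUT_TYPES = {"Video", "Image", "Sound"}


def build_inputs(
    source_content_type: str,
    content_type: str,
    max_nr_of_pairs: int,
    download_from_batch_cloud_folder: bool,
    file_urls: Optional[List[str]] = None,
    custom_vectorizer_name: Optional[str] = None,
) -> dict:
    """Hypothetical helper: validate the documented constraints and
    return the dict to send under the request's "inputs" key."""
    if source_content_type not in SOURCE_TYPES:
        raise ValueError("source_content_type must be 'Video' or 'Sound'")
    if content_type not in OUTPUT_TYPES:
        raise ValueError("content_type must be 'Video', 'Image' or 'Sound'")
    # bool is a subclass of int in Python, so reject it explicitly.
    if (
        isinstance(max_nr_of_pairs, bool)
        or not isinstance(max_nr_of_pairs, int)
        or max_nr_of_pairs < 1
    ):
        raise ValueError("max_nr_of_pairs must be an int >= 1")
    if custom_vectorizer_name is not None and not (
        3 <= len(custom_vectorizer_name) <= 30
    ):
        raise ValueError("custom_vectorizer_name must be 3 to 30 characters")
    if not isinstance(download_from_batch_cloud_folder, bool):
        raise ValueError("download_from_batch_cloud_folder must be a bool")
    # Assumption drawn from the File Requirements section: files come
    # either from the batch cloud folder or from file_urls.
    if not download_from_batch_cloud_folder and not file_urls:
        raise ValueError("provide file_urls or use the batch cloud folder")
    return {
        "source_content_type": source_content_type,
        "content_type": content_type,
        "max_nr_of_pairs": max_nr_of_pairs,
        "custom_vectorizer_name": custom_vectorizer_name,
        "file_urls": file_urls,
        "download_from_batch_cloud_folder": download_from_batch_cloud_folder,
    }


if __name__ == "__main__":
    inputs = build_inputs(
        "Video",
        "Image",
        100,
        False,
        file_urls=["https://example.com/clip_01.mp4"],  # placeholder URL
    )
    print(json.dumps({"inputs": inputs}, indent=2))
```

Nullable fields are sent as null rather than omitted in this sketch; whether the service also accepts omitted keys is not stated in this document.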
Response JSON ["results"]
"similarity_calibration_pairs_download_urls": list of strings
public_url (string)
file_names from the same pair share a prefix with the structure "<pair_id>_cluster"
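Because files from the same pair share a "<pair_id>_cluster" prefix, the returned URLs can be grouped back into pairs by file name. The sketch below is illustrative only: the function name group_by_pair and the example URLs are hypothetical, and it assumes that each list element is a plain URL string whose last path segment is the file name.

```python
import re
from collections import defaultdict
from posixpath import basename
from urllib.parse import unquote, urlparse

# Shortest leading run of characters that ends in "_cluster".
PAIR_PREFIX = re.compile(r"^(.+?_cluster)")


def group_by_pair(urls):
    """Group download URLs by the "<pair_id>_cluster" prefix of the file name.

    URLs whose file name does not carry the prefix are collected under
    the key None so that nothing is silently dropped.
    """
    pairs = defaultdict(list)
    for url in urls:
        # Take the last path segment, ignoring any query string.
        file_name = basename(unquote(urlparse(url).path))
        match = PAIR_PREFIX.match(file_name)
        key = match.group(1) if match else None
        pairs[key].append(url)
    return dict(pairs)


if __name__ == "__main__":
    # Placeholder URLs; real file names may look different.
    example = [
        "https://example.com/exports/7_cluster_a.jpg",
        "https://example.com/exports/7_cluster_b.jpg",
        "https://example.com/exports/12_cluster_a.jpg",
    ]
    for prefix, members in group_by_pair(example).items():
        print(prefix, len(members))
```

Checking that each group contains exactly two URLs is a cheap sanity test before the pairs are used for calibration.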
File Requirements
Files must either be sent via FTP to the batch cloud folder or be provided in file_urls.