Image Indexing through FTP Upload
What is FTP file upload?
To search your images visually using our solutions, you first need to upload the images and their associated data into our index. We currently support three methods of uploading data to our service.
- Upload CSV file on Dashboard - intended for one time upload and quick testing.
- ViSenze Data API - intended for near real-time upload.
- FTP file upload (described in this guide) - intended for a regular sync that is less real-time (e.g. once a day).
When should I consider using FTP upload?
Consider FTP upload if you:
- Wish to use a simple daily batch sync process through files.
- Don’t want to integrate with ViSenze’s Data API (due to complexity).
- Are OK with an index update of once or twice a day.
Keep the following in mind:
- FTP upload is intended to be an asynchronous batch process. It’s simple to use, but it also means your index is updated only once a day. For some customers, this might not be sufficient.
- If your database changes very quickly (fast-moving fashion, flash sales, updates every minute, etc.), consider using the ViSenze Data API instead.
- If you just need a one-time upload of your database, consider using our Dashboard CSV file upload.
- FTP Upload is not recommended for trial users.
FTP upload procedure
- You first need to create an FTP account with ViSenze. Each customer receives login credentials (a username and password) for ViSenze’s FTP servers. To request an account, please contact your account manager or write to email@example.com.
- Connect to the FTP account and upload your files to the default folder of the account.
- Server name : [your ftp server]
- Username : [your ftp username]
- Password : [your ftp password]
- Directory: the default folder after login
- Upload the data file as a gzip compressed csv file.
- Upload an empty .txt file that follows the same naming convention as the data file.
Please note: without the empty .txt file, we won’t process the uploaded file automatically. The data file and the empty .txt file should both follow the naming convention described below.
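The upload steps above can be sketched in Python as follows. This is a minimal sketch, not an official client: it assumes the server accepts plain FTP (as handled by the standard ftplib module), and the function name upload_feed, the host name, the credentials, and the file names are placeholders. The data file is sent first and the empty .txt file last, so that processing is only triggered once the data file is fully uploaded.

```python
import io
import os
from ftplib import FTP  # used in the commented usage example below


def upload_feed(ftp, data_path, flag_name):
    """Upload the gzipped data file, then the empty .txt flag file.

    `ftp` is a logged-in ftplib.FTP connection positioned in the
    default folder of the account.
    """
    data_name = os.path.basename(data_path)
    with open(data_path, "rb") as fh:
        ftp.storbinary(f"STOR {data_name}", fh)
    # The empty flag file goes last: it signals the end of the upload.
    ftp.storbinary(f"STOR {flag_name}", io.BytesIO(b""))


# Usage (placeholders, substitute the values issued for your account):
#
# with FTP("your-ftp-server") as ftp:
#     ftp.login("your-ftp-username", "your-ftp-password")
#     upload_feed(ftp, "YOUR_ACCESS_KEY.csv.gz", "YOUR_ACCESS_KEY.csv.txt")
```

If your account requires a secured connection, the same calls are available on ftplib.FTP_TLS; whether that applies is something to confirm with your account manager.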
Tips on generating data file
Typically you will need to write a server-side script that does the following on your end:
- Query your database to get the data that needs to be indexed with ViSenze.
- Save the data in a CSV file format.
- Compress the file using gzip.
- Upload the gzipped file to ViSenze’s FTP servers.
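The CSV and gzip steps can be combined in a few lines of Python. The sketch below assumes the rows have already been fetched from your database as dictionaries; the function name write_feed and the sample values are illustrative only, and the actual field list should come from the schema on your ViSenze Dashboard.

```python
import csv
import gzip


def write_feed(rows, fieldnames, path):
    """Write rows (dicts) to a gzip-compressed, UTF-8 CSV file with a header row."""
    # "wt" opens the gzip stream in text mode; newline="" lets the csv
    # module control line endings itself.
    with gzip.open(path, "wt", encoding="utf-8", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Usage with made-up sample data:
#
# rows = [{"im_name": "sku-1", "im_url": "https://example.com/sku-1.jpg"}]
# write_feed(rows, ["im_name", "im_url"], "YOUR_ACCESS_KEY.csv.gz")
```

Writing directly through gzip.open avoids creating an uncompressed intermediate file, which helps when the feed is large.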
Processing the uploaded file
- The ViSenze batch indexing service will check your FTP folder once every 5 minutes for any new .txt files uploaded.
- When a new .txt file is detected, the service will fetch the respective data file within 5 minutes (provided the uploaded file is still there).
- The batch indexing service will then start processing the file and update your index using the following rules.
- Each unique data row is identified using im_name (which is our identifier for an image and its associated data).
- If an im_name is in the file and also in the index, the process will update the data entity.
- If an im_name is in the file and not in the index, the process will insert this new data entity.
- If an im_name is in the index but not in the file, the process will delete the entity from our index.
- At the end of this process, your ViSenze index should mirror the data in the uploaded file.
- Once the batch indexing service has finished processing the file, we will move the file to the archive folder.
- Our batch indexing service will also check the indexing status of the uploaded data and save the progress to a specific file under the same ftp folder.
- When the indexing is completed a status file is created in the same ftp folder with the results.
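The update, insert, and delete rules above amount to a set comparison on im_name. The following Python sketch only illustrates that logic; it is not ViSenze's implementation, and the function name plan_sync is hypothetical. It can be useful for estimating, before an upload, how many entries a new file would add or remove.

```python
def plan_sync(file_names, index_names):
    """Classify im_name values according to the full-replace rules."""
    file_names = set(file_names)
    index_names = set(index_names)
    return {
        "update": file_names & index_names,  # in the file and in the index
        "insert": file_names - index_names,  # in the file, not in the index
        "delete": index_names - file_names,  # in the index, not in the file
    }


# Usage with made-up identifiers:
#
# plan = plan_sync(["a", "b"], ["b", "c"])
# plan["insert"] is {"a"}, plan["update"] is {"b"}, plan["delete"] is {"c"}
```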
Data file requirements
File name format :
- Please gzip your datafeed file for upload. The extension of the datafeed file should be: .csv.gz
- Your admin access key should be the filename. For example: a22cd4ebc4a06c82fe71fdsssaa40c.csv.gz
- We support two process modes:
- Default mode: the contents of this file will be used to fully replace your current image database. For example: a22cd4ebc4a06c82fe71fdsssaa40c.csv.gz
- INCREMENT mode: the contents of this file will be used to append new entries or update existing entries. For example: a22cd4ebc4a06c82fe71fdsssaa40c_INCREMENT.csv.gz
- Send a flag file named a22cd4ebc4a06c82fe71fdsssaa40c.csv.txt to signal the end of the upload. We begin processing your data only after this file is received.
- Do not insert any other underscores or special characters into the file name.
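The naming rules above can be captured in a small helper. This is a sketch with a hypothetical function name, feed_file_names. The default-mode names follow the examples given above; the flag file name for INCREMENT mode is an assumption, derived from the rule that the .txt file is named like the data file except for the extension.

```python
def feed_file_names(access_key, increment=False):
    """Return (data_file_name, flag_file_name) for the given admin access key."""
    stem = f"{access_key}_INCREMENT" if increment else access_key
    # Assumption: the flag file swaps only the final .gz for .txt,
    # as in the default-mode example (.csv.gz -> .csv.txt).
    return f"{stem}.csv.gz", f"{stem}.csv.txt"


# Usage with the example key from this guide:
#
# feed_file_names("a22cd4ebc4a06c82fe71fdsssaa40c")
# feed_file_names("a22cd4ebc4a06c82fe71fdsssaa40c", increment=True)
```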
File data format
- The data in the file must be in the comma-separated format specified in our documentation.
- A header row with the field names should be included. For the list of fields, please check the schema on your ViSenze Dashboard.
- im_name and im_url are required fields. If a row has an invalid or missing im_name or im_url, we’ll skip that row.
- Use a comma “,” to separate the columns.
- The csv datafeed must be encoded as UTF-8.
- Since the data file could be large, we require compressing the file using gzip.
- Only .gz extension is supported for now. Other extensions such as .zip, .gzip etc are not supported.
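Before uploading, it can help to run a local check against these requirements. The sketch below is an illustration under stated assumptions, not the validation ViSenze performs: the function name check_feed is hypothetical, and it treats im_name and im_url as the required fields (im_name being the row identifier used during indexing). It verifies the extension, decodes the file as UTF-8, checks the header, and counts rows that would likely be skipped.

```python
import csv
import gzip

REQUIRED_FIELDS = ("im_name", "im_url")  # assumed required fields


def check_feed(path):
    """Return (valid_rows, skipped_rows) for a gzipped CSV feed file."""
    if not path.endswith(".csv.gz"):
        raise ValueError("feed file must use the .csv.gz extension")
    valid = skipped = 0
    # Decoding as UTF-8 raises UnicodeDecodeError on a wrongly encoded file.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as fh:
        reader = csv.DictReader(fh)
        header = reader.fieldnames or []
        missing = [name for name in REQUIRED_FIELDS if name not in header]
        if missing:
            raise ValueError(f"header is missing required fields: {missing}")
        for row in reader:
            # A short row yields None for absent columns, hence the `or ""`.
            if all((row[name] or "").strip() for name in REQUIRED_FIELDS):
                valid += 1
            else:
                skipped += 1
    return valid, skipped
```

This check only tests that the required values are present; it does not verify that each im_url actually points to a reachable image.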
Empty txt file requirements
- An empty txt file should be uploaded once your data file upload is complete.
- The name of the .txt file should be the same as that of the data file, except for the extension.
Please note: automatic indexing will not be triggered if we fail to detect a new .txt file that accompanies the data file.
Status File Format
After uploading, you can check the FTP folder for a status file that contains information on the indexing progress. Below is an example of the status file content.