How to convert the JSON files exported by the Supervisely annotation tool to the Darknet format in Python? GitHub repo: JinhangZhu/supervisely-to-darknet
JSON
JSON (JavaScript Object Notation) is a popular data format for storing structured data. I don't care about everything it can do; what I need is to convert this format into the format that my YOLOv3 model can take. But first things first, let's see what format JSON has.
Syntax rules
- Data is in name/value pairs.
- Data is separated by commas `,`.
- Curly braces `{}` hold objects.
- Square brackets `[]` hold arrays.
Components
Name/value pairs. A data pair consists of a field name in double quotes `""`, followed by a colon `:`, followed by a value: `"classTitle": "left_hand"`
Objects. Objects are written inside curly braces `{}`. There may be multiple name/value pairs inside one pair of curly braces, just like dictionaries in Python.

```json
{"firstName":"John", "lastName":"Doe"}
```
Arrays. JSON arrays are written in square brackets `[]`. Like lists in Python, an array can contain objects.

```json
"employees": [  // The object "employees" is an array that contains three objects
  {"firstName":"John", "lastName":"Doe"},
  {"firstName":"Anna", "lastName":"Smith"},
  {"firstName":"Peter", "lastName":"Jones"}
]
```
Python JSON
In Python, JSON exists as a string. For example:
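The snippet below is a minimal sketch; the name and values are made up.

```python
# A JSON object held as an ordinary Python string (illustrative values)
person = '{"name": "Bob", "languages": ["English", "French"]}'
```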
To work with JSON (a string, or a file containing a JSON object), we use Python's `json` module.
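It ships with the standard library, so importing it is all the setup needed:

```python
import json  # Python's built-in JSON encoder/decoder
```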
Parse JSON to dict
To parse a JSON string, we use the `json.loads()` method, which returns a dictionary. For example:
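The snippet below is a small sketch reusing the illustrative string from above.

```python
import json

person = '{"name": "Bob", "languages": ["English", "French"]}'
person_dict = json.loads(person)  # parse the JSON string into a dict

print(person_dict["languages"])   # ['English', 'French']
```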
Read JSON file
Our Supervisely annotations are stored in JSON files, so we need to load the file first. For example, say a `.json` file contains a JSON object:
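The sketch below assumes the file is called `person.json` and holds the illustrative object from above.

```json
{"name": "Bob", "languages": ["English", "French"]}
```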
And we parse the file:
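A sketch using `json.load()`, which reads from a file object and returns a dictionary (the file name is the assumed one from above):

```python
import json

with open("person.json") as f:   # the file name is an assumption
    person_dict = json.load(f)

print(person_dict["name"])       # Bob
```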
Supervisely format
The Supervisely JSON-based annotation format supports several figures, including `rectangle`, `line`, `polygon`... BUT we only care about the rectangular objects.
JSON for the whole project
Each project has predefined object classes and tags. The file `meta.json` contains this information. Ours is as follows:
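Below is a trimmed sketch of what it can look like; only `left_hand` appears elsewhere in this post, so the second class name and the colour codes are assumptions.

```json
{
  "classes": [
    {
      "title": "left_hand",
      "shape": "rectangle",
      "color": "#F6FF00"
    },
    {
      "title": "right_hand",
      "shape": "rectangle",
      "color": "#00FF2B"
    }
  ]
}
```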
- "classes": list of objects - all possible object classes.
- "title": string - the unique identifier of a class - the name of the class.
- "shape": string - annotation shape.
- "color": string - hex color code (not important here)
- ...
JSON for an image
For each image, we keep a JSON file with annotations.
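Below is a trimmed sketch of one such file; all the numbers are illustrative.

```json
{
  "size": {
    "width": 1360,
    "height": 800
  },
  "objects": [
    {
      "classTitle": "left_hand",
      "points": {
        "exterior": [[774, 411], [815, 446]],
        "interior": []
      }
    }
  ]
}
```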
"size": is equal to image size.
- "width": image width in pixels
- "height": image height in pixels
"objects": list of objects, contains fields about the annotated label rectangles with their values.
"classTitle": string - the name of the class. It is used to identify the class shape from the
meta.json
."points": object with two fields:
"exterior": list of two lists with two numbers (coordinates):
[[left, top], [right, bottom]]
"interior": always empty for rectangles.
Darknet format
The Darknet format specifies not only the annotation format for each image but also the layout of files in folders. We follow the format of COCO: images and labels are in separate parallel folders, with one label file per image (if there are no objects in an image, no label file is required).
Label files
The label file specifications are:

- One row per object.
- Each row is in the format `class b_x_center b_y_center b_width b_height`.
- Box coordinates must be in normalised xywh format (from 0 to 1). Since Supervisely coordinates are in pixels, a normalisation step is required on both the x and y axes.
Say $x_{LT}, y_{LT}, x_{RB}, y_{RB}$ are respectively the elements in `[[left, top], [right, bottom]]`, and $width, height$ are the image sizes. Then the normalisation is:

$$
\begin{aligned}
\text{b\_x\_center} &= \frac{x_{LT}+x_{RB}}{2\times width}\\
\text{b\_y\_center} &= \frac{y_{LT}+y_{RB}}{2\times height}\\
\text{b\_width} &= \frac{x_{RB}-x_{LT}}{width}\\
\text{b\_height} &= \frac{y_{RB}-y_{LT}}{height}
\end{aligned}
$$

- Class numbers are zero-indexed (start from 0).
For example, a one-row label:

```
1 0.5841911764705883 0.535625 0.030147058823529412 0.04375
```
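As a quick check of the formulas, here is a small helper, a sketch rather than the repo's code, that converts one Supervisely rectangle into a normalised Darknet row:

```python
def to_darknet_row(class_id, exterior, width, height):
    """Turn a Supervisely rectangle [[left, top], [right, bottom]] into a
    normalised 'class x_center y_center width height' row."""
    (x_lt, y_lt), (x_rb, y_rb) = exterior
    b_x_center = (x_lt + x_rb) / (2 * width)
    b_y_center = (y_lt + y_rb) / (2 * height)
    b_width = (x_rb - x_lt) / width
    b_height = (y_rb - y_lt) / height
    return f"{class_id} {b_x_center} {b_y_center} {b_width} {b_height}"

# Illustrative call: an assumed 1360x800 image with one box
print(to_darknet_row(1, [[774, 411], [815, 446]], 1360, 800))
# prints a row of the same form as the example above
```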
Each image's label file must be locatable by simply replacing `/images/*.jpg` with `/labels/*.txt` in its path name.
Data splitting
There should be a `.txt` file that contains the locations of the images of the dataset. Each row contains a path to an image, and remember that a label file must also exist in the corresponding `/labels` folder for each image containing objects.
.names file
The file lists all the class names in the dataset. Each row contains one class name.
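For our two-class P30 dataset it would look something like the following; the second class name is an assumption:

```
left_hand
right_hand
```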
.data file
There should be the class count (e.g. COCO has 80 classes and P30 has 2), the paths to the train and validation datasets (the `.txt` files mentioned above), and the path to the `.names` file.
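A sketch of such a file; all paths and file names are assumptions:

```
classes=2
train=./dataset/train.txt
valid=./dataset/valid.txt
names=./dataset/p30.names
```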
Coding
Firstly, we create a subfolder called "./dataset/" (or something else), which will contain all the data we generate.
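A minimal sketch:

```python
import os

output_dir = './dataset/'   # name is arbitrary
os.makedirs(output_dir, exist_ok=True)
```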
Then, we need to know what classes the whole image set has. This information is easily found in `meta.json`, so we import the JSON file and read the values of the key "classes". After reading, the class names are appended to the `.names` file.
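A sketch of this step; the export path and the `.names` file name are assumptions:

```python
import json
import os

with open('./P30/meta.json') as f:   # path to the Supervisely export (assumed)
    meta = json.load(f)

class_names = [c['title'] for c in meta['classes']]

# Append every class name to the .names file, one per row
with open(os.path.join('./dataset/', 'p30.names'), 'a') as f:
    for name in class_names:
        f.write(name + '\n')
```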
We create a function called `conver_supervisely_json()` that handles making the folders and obtaining the classes before writing the labels.
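A sketch of its skeleton; the signature and default paths are assumptions, and the full implementation lives in the GitHub repo:

```python
import json
import os

def conver_supervisely_json(data_dir='./P30/', output_dir='./dataset/'):
    """Create the output folders, read the classes, then write the labels."""
    images_dir = os.path.join(output_dir, 'images')
    labels_dir = os.path.join(output_dir, 'labels')
    os.makedirs(images_dir, exist_ok=True)
    os.makedirs(labels_dir, exist_ok=True)

    # Obtain the class names from meta.json and write the .names file
    with open(os.path.join(data_dir, 'meta.json')) as f:
        meta = json.load(f)
    class_names = [c['title'] for c in meta['classes']]
    with open(os.path.join(output_dir, 'p30.names'), 'w') as f:
        f.write('\n'.join(class_names) + '\n')

    # ... label writing comes next
```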
As Supervisely exports images and annotation files in separate folders `img` and `ann`, we use `glob` to obtain the iterable paths of the files within the two folders and then sort them.
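Roughly like this, with the export folder path assumed:

```python
import os
from glob import glob

data_dir = './P30/'   # the Supervisely export folder (assumed)

img_paths = sorted(glob(os.path.join(data_dir, 'img', '*.jpg')))
ann_paths = sorted(glob(os.path.join(data_dir, 'ann', '*.json')))
```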
It is now time to import the JSON files and read data from them. We will give each image with at least one object a `.txt` label file in the labels folder. For each object bounding box, there should be a class index (an integer), normalised center coordinates and a normalised size. We also copy the images to the images folder.
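A sketch of that loop; the variable names carry over from the snippets above and the output paths are the assumed ones:

```python
import shutil

for img_path, ann_path in zip(img_paths, ann_paths):
    with open(ann_path) as f:
        ann = json.load(f)

    width = ann['size']['width']
    height = ann['size']['height']

    rows = []
    for obj in ann['objects']:
        class_id = class_names.index(obj['classTitle'])   # zero-indexed class
        (x_lt, y_lt), (x_rb, y_rb) = obj['points']['exterior']
        b_x_center = (x_lt + x_rb) / (2 * width)
        b_y_center = (y_lt + y_rb) / (2 * height)
        b_width = (x_rb - x_lt) / width
        b_height = (y_rb - y_lt) / height
        rows.append(f"{class_id} {b_x_center} {b_y_center} {b_width} {b_height}")

    if rows:  # only images with at least one object get a label file
        name = os.path.splitext(os.path.basename(img_path))[0]
        with open(os.path.join('./dataset/labels/', name + '.txt'), 'w') as f:
            f.write('\n'.join(rows) + '\n')
        shutil.copy(img_path, './dataset/images/')
```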
Run the function and we see 754 label files in the labels folder, which matches the number of images with at least one object shown by the Supervisely filter.
As YOLOv3 requires a train set and a validation set in the form of collections of path names, we need to create two `*.txt` files that contain separate sets of paths of the images within the `./dataset/images` folder. This feature is supposed to be implemented by:
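Something along these lines; the function name and arguments are placeholders:

```python
split_dataset(image_dir='./dataset/images/', train_ratio=0.9)
```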
Dataset splitting is thus transformed into element splitting, i.e. we randomly choose collections of elements of different sizes from the whole set of path names: we randomly choose indices, then use those indices to select the separate sets of paths.
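A sketch of the index selection inside the hypothetical `split_dataset()`:

```python
import os
import random
from glob import glob

image_dir = './dataset/images/'
train_ratio = 0.9

img_paths = sorted(glob(os.path.join(image_dir, '*.jpg')))
n = len(img_paths)

# Randomly choose which indices belong to the training set;
# the remaining indices form the validation set
train_idx = set(random.sample(range(n), int(n * train_ratio)))
valid_idx = set(range(n)) - train_idx
```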
Split the paths into several sets according to whether the corresponding index collections are empty or not:
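A sketch of writing out each non-empty split; the file names are assumptions:

```python
splits = {'train.txt': train_idx, 'valid.txt': valid_idx}

for filename, indices in splits.items():
    if not indices:   # skip a split whose index collection is empty
        continue
    with open(os.path.join('./dataset/', filename), 'w') as f:
        for i in sorted(indices):
            f.write(img_paths[i] + '\n')
```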
Run the code:
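For instance, with placeholder arguments:

```python
if __name__ == '__main__':
    conver_supervisely_json(data_dir='./P30/', output_dir='./dataset/')
    split_dataset(image_dir='./dataset/images/', train_ratio=0.9)
```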
and we'll see the files within the folder we specified:
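Roughly this layout; the exact file names depend on the names chosen above:

```
dataset/
├── images/      # 754 copied images
├── labels/      # 754 .txt label files
├── p30.names
├── train.txt
└── valid.txt
```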