FashionGraph: Understanding Fashion Data Using Scene Graph Generation
Shabnam Sadegharmaki (1), Marc A. Kastner (2), Shin'ichi Satoh (2)
(1) Technical University of Munich, (2) National Institute of Informatics, Tokyo

Motivation
• Brought the idea of scene graphs to fashion images, which helps in better understanding fine-grained fashion data.
• A model to generate fashion scene graphs using object and relationship detection models; we generated new annotations for this purpose.
• Integrated attribute detection into the scene graph model.
• Highlighted the application of scene graphs (SG) for fashion image retrieval.

Table 1. RelDN[2] on Visual Genome vs. FashionGraph on Fashionpedia
Method        Dataset         R@20   R@50   R@100
RelDN[2]      Visual Genome   23     31     37
FashionGraph  Fashionpedia    18     22     24

Figure 1. Example of ground-truth annotations vs. the predicted graph (panels: ground-truth annotations vs. FashionGraph, proposed).

Data Annotation
We provide relationship detection annotations for the Fashionpedia dataset[1]. To train an SG generator, we need the following annotations:
• Fine-grained segmentation: we extracted only bounding boxes and object labels.
• Relationships in the format (object, subject, predicate).

The SG annotations are derived from the Fashionpedia dataset as follows (see the annotation sketch after the References):
• Segmentations: masks are converted to bounding boxes.
• Hierarchy: box overlaps with IoU > 0.9 become "belongs to" relationships.
• Images: color labels are extracted.
• Attributes: attribute labels become object-only relationships.

Relationships for fashion data include:
• Hierarchical, e.g. pocket belongs to shirt
• Attributes, e.g. dress is A-line
• Color, e.g. jacket is blue

Architecture
Object detection:
• ResNeXt (101-64x4d-FPN)
• Weights pre-trained on Visual Genome
• Fine-tuned on Fashionpedia

SG detection:
• RelDN[2] (ResNeXt)
• Trained on our relationship annotations
• Outputs: subject bbox/score, object bbox/score, predicate scores, relationships

Post-processing (see the filtering sketch after the References): a detected relationship is filtered out if
• the product of its subject and object detection scores is below a threshold,
• its relationship confidence is below a threshold, or
• its attribute confidence is below a threshold.

Application: Image Retrieval
• To rank the images for a given query image, we represent the predicted scene graph by four matrices: objects, hierarchical relationships, attributes, and colors.
• Then we calculate the cosine similarity of each type between the images and the query (see the retrieval sketch after the References).

Figure 2. Qualitative evaluation on fashion image retrieval (query and retrieved results for the ground-truth baseline vs. FashionGraph, proposed).

Contact
sadeghar@in.tum.de

Code
https://github.com/shabnamsadegh/FashionGraph

References
[1] Jia, Menglin, et al. "Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset." arXiv preprint arXiv:2004.12276 (2020).
[2] Zhang, Ji, et al. "Graphical Contrastive Losses for Scene Graph Parsing." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
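Annotation sketch. The Data Annotation section derives SG annotations from Fashionpedia by converting masks to bounding boxes, turning large box overlaps (IoU > 0.9) into "belongs to" relations, and adding object-only attribute and color relations. The Python below is a minimal sketch of that derivation, assuming a hypothetical per-object schema (`label`, `mask`, `attributes`, `color`); the helper names and the rule that the smaller box belongs to the larger one are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def mask_to_bbox(mask):
    """Convert a binary segmentation mask (H x W) into an [x1, y1, x2, y2] box."""
    ys, xs = np.where(mask > 0)
    return [xs.min(), ys.min(), xs.max(), ys.max()]

def box_area(box):
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])

def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    inter = box_area([max(box_a[0], box_b[0]), max(box_a[1], box_b[1]),
                      min(box_a[2], box_b[2]), min(box_a[3], box_b[3])])
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0

def derive_relationships(objects, iou_thresh=0.9):
    """Build (subject, predicate, object) triples for one image.

    `objects` is a list of dicts with keys 'label', 'mask', 'attributes',
    'color' -- a hypothetical schema used only for this sketch.
    """
    for obj in objects:
        obj["bbox"] = mask_to_bbox(obj["mask"])

    triples = []
    for part in objects:
        # Hierarchical relations: heavily overlapping boxes -> "belongs to".
        # Directing the relation from the smaller to the larger box is an assumption.
        for garment in objects:
            if (part is not garment
                    and iou(part["bbox"], garment["bbox"]) > iou_thresh
                    and box_area(part["bbox"]) < box_area(garment["bbox"])):
                triples.append((part["label"], "belongs to", garment["label"]))
        # Object-only relations, e.g. ("dress", "is", "A-line"), ("jacket", "is", "blue").
        for attr in part.get("attributes", []):
            triples.append((part["label"], "is", attr))
        if part.get("color"):
            triples.append((part["label"], "is", part["color"]))
    return triples
```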
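Filtering sketch. The Architecture section post-processes the RelDN outputs by thresholding; the exact formulas are only partially legible in the poster, so the code below is one plausible interpretation: the product of the subject and object detection scores and a type-specific predicate confidence are each compared against a threshold. The field names, the relationship schema, and the threshold values are all hypothetical.

```python
def filter_relationships(relationships,
                         score_thresh=0.2,  # hypothetical values; the poster only
                         rel_thresh=0.3,    # states that thresholds are applied
                         attr_thresh=0.3):
    """Drop low-confidence triples produced by the scene graph detector.

    Each relationship is assumed to be a dict with 'subject_score',
    'object_score', 'predicate_score' and a 'type' in
    {'hierarchical', 'attribute', 'color'} (hypothetical schema).
    """
    kept = []
    for rel in relationships:
        # Condition 1: joint subject/object detection confidence.
        if rel["subject_score"] * rel["object_score"] < score_thresh:
            continue
        # Conditions 2/3: type-specific confidence of the predicted predicate.
        thresh = attr_thresh if rel["type"] in ("attribute", "color") else rel_thresh
        if rel["predicate_score"] < thresh:
            continue
        kept.append(rel)
    return kept
```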
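Retrieval sketch. The Application section represents each predicted scene graph by four matrices (objects, hierarchical relationships, attributes, colors) and compares query and gallery images with a per-type cosine similarity. The sketch below encodes each type as a simple count vector over a vocabulary and averages the four similarities to rank images; the count-vector encoding and the averaging step are assumptions, since the poster does not specify how the per-type similarities are combined.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-negative count vectors."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v) / denom if denom > 0 else 0.0

def graph_to_vectors(graph, vocabs):
    """Encode a predicted scene graph as one count vector per annotation type.

    `graph` maps each type ('objects', 'hierarchical', 'attributes', 'colors')
    to a list of items; `vocabs` maps each type to an ordered vocabulary list.
    Both structures are hypothetical, for illustration only.
    """
    vectors = {}
    for kind, vocab in vocabs.items():
        vec = np.zeros(len(vocab))
        for item in graph.get(kind, []):
            if item in vocab:
                vec[vocab.index(item)] += 1
        vectors[kind] = vec
    return vectors

def rank_images(query_graph, gallery_graphs, vocabs):
    """Rank gallery images by their average per-type cosine similarity to the query."""
    q = graph_to_vectors(query_graph, vocabs)
    scores = []
    for idx, g in enumerate(gallery_graphs):
        vecs = graph_to_vectors(g, vocabs)
        sims = [cosine(q[k], vecs[k]) for k in vocabs]
        scores.append((idx, sum(sims) / len(sims)))
    return sorted(scores, key=lambda s: s[1], reverse=True)
```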