Common Object Detection

Visual Detection Overview

Visual detection models localize an object of interest in an image by returning a bounding box around it, together with the object's type, also referred to as its class. A detector can detect multiple objects of different classes per image. For each detection, a detector outputs a confidence score that is independent of any other detections.

The output object in Hive detection APIs lists each detected object, including:

  • The geometric description of the detected bounding box.
  • The predicted class for the detection.
  • For some models, the confidence score for the detection.
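
As a rough illustration of how a client might hold these three pieces of information, the sketch below parses one detection into a small Python data class. The field names (`box`, `x_min`, `class`, `score`, and so on) and the helper names are assumptions made for this example, not the actual Hive response schema; consult the API reference for the real field names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """One detected object: bounding box, predicted class, optional score."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str
    score: Optional[float] = None  # only some models return a score


def parse_detection(raw: dict) -> Detection:
    """Build a Detection from a dict with hypothetical field names."""
    box = raw["box"]  # assumed key for the bounding-box geometry
    return Detection(
        x_min=box["x_min"],
        y_min=box["y_min"],
        x_max=box["x_max"],
        y_max=box["y_max"],
        label=raw["class"],       # assumed key for the predicted class
        score=raw.get("score"),   # missing score is stored as None
    )


def filter_detections(detections: List[Detection],
                      min_score: float = 0.5) -> List[Detection]:
    """Keep detections at or above min_score; keep those with no score."""
    return [d for d in detections
            if d.score is None or d.score >= min_score]
```

Because each confidence score is independent of the other detections, a threshold such as `min_score` can be applied to every detection on its own, without renormalizing across the image.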

When you submit a video, Hive’s backend splits it into frames, runs the model on each frame, and then merges the per-frame results into a single response for the entire video. The video output for a detector resembles a list of detection output objects, each associated with a timestamp.
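
To make the per-frame structure concrete, here is a minimal sketch that collects the timestamps at which each class appears in a video result. It assumes the response has already been reduced to a list of frames, each a dict with a `timestamp` and a list of `detections`; these keys and the function name are hypothetical and may not match the actual response format.

```python
from collections import defaultdict
from typing import Dict, List


def timestamps_by_class(frames: List[dict]) -> Dict[str, List[float]]:
    """Map each detected class to the sorted timestamps where it appears."""
    seen = defaultdict(list)
    for frame in frames:
        # Use a set so a class detected several times in one frame
        # contributes that frame's timestamp only once.
        labels = {d["class"] for d in frame["detections"]}
        for label in labels:
            seen[label].append(frame["timestamp"])
    return {label: sorted(times) for label, times in seen.items()}


# Example input in the assumed shape: three frames, one second apart.
example_frames = [
    {"timestamp": 0.0, "detections": [{"class": "dog"}, {"class": "person"}]},
    {"timestamp": 1.0, "detections": [{"class": "dog"}, {"class": "dog"}]},
    {"timestamp": 2.0, "detections": []},
]
appearances = timestamps_by_class(example_frames)
```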

Classes

  • wine glass
  • bottle
  • baseball glove
  • baseball bat
  • banana
  • backpack
  • apple
  • train
  • vase
  • umbrella
  • tv
  • truck
  • traffic light
  • toothbrush
  • toilet
  • tie
  • tennis racket
  • teddy bear
  • surfboard
  • suitcase
  • stop sign
  • spoon
  • skis
  • skateboard
  • sink
  • remote
  • sheep
  • scissors
  • refrigerator
  • potted plant
  • pizza
  • person
  • parking meter
  • oven
  • mouse
  • motorcycle
  • microwave
  • laptop
  • knife
  • kite
  • keyboard
  • hot dog
  • horse
  • handbag
  • hair drier
  • frisbee
  • fork
  • fire hydrant
  • donut
  • elephant
  • dog
  • dining table
  • cup
  • cow
  • couch
  • clock
  • chair
  • cell phone
  • cat
  • carrot
  • car
  • cake
  • bus
  • broccoli
  • bowl
  • book
  • boat
  • bird
  • bicycle
  • bench
  • bed
  • airplane
  • bear
  • giraffe
  • orange
  • sandwich
  • snowboard
  • toaster
  • zebra
  • sports ball

Supported File Types

Image Formats:
gif
jpg
png
webp

Video Formats:
mp4
webm
avi
mkv
wmv
mov
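
A client can screen files before upload with a simple extension check against the formats listed above. This is only a sketch: the helper name `media_kind` is made up for this example, and a file extension is a heuristic that does not guarantee the file's actual encoding.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported formats listed above.
IMAGE_FORMATS = {"gif", "jpg", "png", "webp"}
VIDEO_FORMATS = {"mp4", "webm", "avi", "mkv", "wmv", "mov"}


def media_kind(filename: str) -> Optional[str]:
    """Return "image", "video", or None based on the file extension."""
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension in IMAGE_FORMATS:
        return "image"
    if extension in VIDEO_FORMATS:
        return "video"
    return None  # extension is not in either list
```

Only extensions that appear in the lists are accepted, so a name such as `photo.jpeg` is reported as unsupported by this check even though `photo.jpg` passes.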


What’s Next

See the API reference for more details on the API interface and response format.