PETR

Position Embedding Framework

Develops a framework for multi-view 3D object detection and perception from camera images using position embedding transformation.

[ECCV2022] PETR: Position Embedding Transformation for Multi-View 3D Object Detection & [ICCV2023] PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

GitHub

881 stars
15 watching
132 forks
Language: Python
Last commit: over 1 year ago
Linked from 1 awesome list

Topics: 3d-position-embedding, multi-camera, multi-task-learning, object-detection, segmentation

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| megvii-research/tlc | Improves image restoration performance by converting global operations to local ones during inference | 231 |
| vita-epfl/monoloco | A software framework for 3D vision and computer vision tasks using deep learning and 2D keypoints | 431 |
| gink03/alt-i2v | An implementation of a deep learning-based image representation learning approach using a modified fully connected layer and transfer learning from VGG16 | 34 |
| gamrix/cs231n_proj | Manipulates 3D views using deep learning techniques | 6 |
| plasticityai/magnitude | A fast and efficient utility package for utilizing vector embeddings in machine learning models | 1,635 |
| jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 517 |
| appinho/sarosperceptionkitti | ROS package for perception processing and evaluation in the KITTI Vision Benchmark Suite | 247 |
| icetttb/planetr3d | An implementation of a deep learning-based method for 3D plane recovery from images | 94 |
| vmarsocci/3dcd | Automatically infers 2D and 3D change detection maps from bitemporal optical images without relying on DSMs | 29 |
| esri/pyprt | Python bindings for CityEngine's procedural runtime for generating 3D models | 64 |
| megviirobot/camlasercalibratool | Automates extrinsic calibration of cameras and 2D lasers in robotics using ROS | 662 |
| petworm/larvio | An implementation of a monocular visual-inertial odometry algorithm based on the Multi-State Constraint Kalman Filter for accurate and robust localization | 737 |
| vita-epfl/crowdnav | Develops robot navigation policies in crowded spaces using reinforcement learning and attention mechanisms | 607 |
| lavi-lab/visual-table | Generates visual representations tailored for general visual reasoning, leveraging hierarchical scene descriptions and instance-level world knowledge | 14 |
| yunishi3/3d-fcr-alphagan | A generative model for 3D multi-object scenes using a novel network architecture inspired by auto-encoding and generative adversarial networks | 103 |