CARMA: Context-Aware Situational Grounding Combining Vision-Language Models with Object and Action Recognition
The framework is set up for Unix systems.

To use GPT-4, you need to set your OpenAI API key:

```bash
export OPENAI_API_KEY="53CRE7_KEY"
```

Clone the repository and install the dependencies:

```bash
git clone https://github.com/HRI-EU/carma.git
cd carma

# Create virtual environment
python -m venv venv

# Activate virtual environment
source venv/bin/activate   # Linux
call venv\Scripts\activate # Windows

# Install dependencies
pip install -e .
```

To run CARMA, activate the virtual environment and start the example:

```bash
# Activate virtual environment
source venv/bin/activate   # Linux
call venv\Scripts\activate # Windows

# Run CARMA
python -m examples.carma
```

The default experiment is Sorting Fruits with one person, as described in the paper.
You can change the system configuration by modifying examples/carma.py here:
https://github.com/HRI-EU/carma/blob/main/examples/carma.py#L302
The first boolean switches the use of the action label on or off, the second boolean controls the action trigger, and the last boolean controls whether the previous triplet is included in the prompt.
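A minimal sketch of what these three switches might look like in examples/carma.py; the variable names below are illustrative only, not the actual identifiers used in the repository:

```python
# Hypothetical flag names -- see examples/carma.py#L302 for the actual ones.
use_action_label = True       # include the recognized action label in the prompt
use_action_trigger = True     # trigger the query when an action is detected
use_previous_triplet = False  # add the previously grounded triplet to the prompt
```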
As when running CARMA, you can evaluate the different runs by setting the same booleans again at:
https://github.com/HRI-EU/carma/blob/main/src/evaluation/evaluate_carma.py#L377
The default experiment is again Sorting Fruits with one person.
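The evaluation script exposes the same kind of switches as the run configuration; a minimal sketch, again with illustrative names, assuming the actual flags are defined around evaluate_carma.py#L377:

```python
# Hypothetical flag names -- see src/evaluation/evaluate_carma.py#L377 for the actual ones.
use_action_label = True       # evaluate the runs that used the action label
use_action_trigger = True     # evaluate the runs that used the action trigger
use_previous_triplet = False  # evaluate the runs that used the previous triplet
```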
```bash
# run evaluation
python -m evaluation.evaluate_carma
```