In EUREQA, every question is built from an implicit reasoning chain, which we construct by parsing DBPedia. A chain is made of layers, and each layer comprises three components: an entity, a fact about the entity, and a relation between the entity
and its counterpart in the next layer. Stacking layers yields chains with different depths of reasoning. We verbalize each reasoning chain into natural sentences and anonymize the entity of every layer to create the question.
Questions can be solved layer by layer, and each layer is guaranteed to have a unique answer. EUREQA is not a knowledge game: we apply a knowledge filtering process to ensure that most LLMs have sufficient world knowledge to answer our questions.
EUREQA comprises a total of 2,991 questions of different reasoning depths and difficulties. The entities encompass a broad spectrum of topics, effectively reducing any potential bias arising from specific entity categories.
These properties make the data well suited to analyzing the reasoning processes of LLMs.
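To make the layer structure concrete, here is a minimal sketch of how a chain could be represented and verbalized. The `Layer` class, the `verbalize` function, the template syntax, and the two toy layers are all hypothetical and chosen for illustration; they are not the benchmark's actual data format or question wording.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Layer:
    """One layer of a reasoning chain (hypothetical representation)."""
    entity: str              # the hidden answer for this layer
    fact: str                # template for a fact about the entity; "{e}" marks the entity
    relation: Optional[str]  # template linking "{e}" to "{next}"; None on the last layer


def verbalize(chain: List[Layer]) -> str:
    """Turn a chain into a question with every entity anonymized."""
    sentences = []
    for i, layer in enumerate(chain):
        name = f"Entity {i + 1}"  # anonymized placeholder for this layer
        sentences.append(layer.fact.format(e=name))
        if layer.relation is not None:
            sentences.append(layer.relation.format(e=name, next=f"Entity {i + 2}"))
    return " ".join(sentences) + " What is Entity 1?"


# Toy chain of depth 2 (illustrative only, not taken from EUREQA).
chain = [
    Layer("Marie Curie", "{e} won two Nobel Prizes.", "{e} was born in {next}."),
    Layer("Warsaw", "{e} is the capital of Poland.", None),
]
question = verbalize(chain)
depth = len(chain)
print(question)
```

In this sketch the deepest layer carries only a fact, so it can be resolved first and its answer substituted upward, which mirrors the layer-by-layer solving described above.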
Performance

Here we present the accuracy of ChatGPT, Gemini-Pro and GPT-4 on the hard set of EUREQA across different reasoning depths d (the number of layers in a question). We evaluate two prompting strategies: a direct zero-shot prompt and in-context learning (ICL) with two examples. In general, when entities are recursively replaced by the descriptions from the reasoning-chain layers, which eliminates surface-level semantic cues, these models generate more incorrect answers. As the reasoning depth increases from one to five on hard questions, performance declines notably for all models. This finding underscores the significant impact that semantic shortcuts have on response accuracy, and it also indicates that GPT-4 is considerably more capable of identifying and taking advantage of these shortcuts.
| Model | d=1 direct | d=1 ICL | d=2 direct | d=2 ICL | d=3 direct | d=3 ICL | d=4 direct | d=4 ICL | d=5 direct | d=5 ICL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ChatGPT | 22.3 | 53.3 | 7.0 | 40.0 | 5.0 | 39.2 | 3.7 | 39.3 | 7.2 | 39.0 |
| Gemini-Pro | 45.0 | 49.3 | 29.5 | 23.5 | 27.3 | 28.6 | 25.7 | 24.3 | 17.2 | 21.5 |
| GPT-4 | 60.3 | 76.0 | 50.0 | 63.7 | 51.3 | 61.7 | 52.7 | 63.7 | 46.9 | 61.9 |
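
The decline with depth can be read directly off the table. As a small sketch, the snippet below computes the accuracy drop from d=1 to d=5 for each model and prompting strategy. The numbers are copied from the table above; the variable and function names are our own and purely illustrative.

```python
# Accuracy (%) on the hard set for d = 1..5, copied from the table above.
accuracy = {
    "ChatGPT": {
        "direct": [22.3, 7.0, 5.0, 3.7, 7.2],
        "icl": [53.3, 40.0, 39.2, 39.3, 39.0],
    },
    "Gemini-Pro": {
        "direct": [45.0, 29.5, 27.3, 25.7, 17.2],
        "icl": [49.3, 23.5, 28.6, 24.3, 21.5],
    },
    "GPT-4": {
        "direct": [60.3, 50.0, 51.3, 52.7, 46.9],
        "icl": [76.0, 63.7, 61.7, 63.7, 61.9],
    },
}


def drop_d1_to_d5(scores):
    """Accuracy lost between depth 1 and depth 5, in percentage points."""
    return round(scores[0] - scores[-1], 1)


drops = {
    model: {strategy: drop_d1_to_d5(scores) for strategy, scores in by_strategy.items()}
    for model, by_strategy in accuracy.items()
}

for model, by_strategy in drops.items():
    print(model, by_strategy)
```

Every entry in `drops` is positive, which is the decline for all models noted above.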
This website is adapted from Nerfies, UniversalNER and LLaVA, licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. We thank the LLaMA team for giving us access to their models.
Usage and License Notices: The data and code are intended and licensed for research use only. They are also restricted to uses that follow the license agreements of LLaMA, ChatGPT, and the original dataset used in the benchmark. The dataset is licensed under CC BY NC 4.0 (allowing only non-commercial use), and models trained using the dataset should not be used outside of research purposes.