Documentation


The underlying source of the data is DG COMP's public database of all merger cases. The case information and all relevant decisions were collected with an automated web scraper. To map the citation relationships between decisions, a Python parser was developed that automatically extracts all citation-related information from the decisions' PDF files. The advantage of this approach is that Merger cite can map the full merger landscape. The disadvantage is that a margin of error may exist, and some citations between cases may be missing as a result of the automated procedure. Random manual checks have shown high precision of the automated retrieval, both in the identity of the citation pairs and in the number of occurrences of each pair.
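
As an illustration of the kind of extraction described above, the following is a minimal sketch and not the project's actual parser. It assumes that the text of a decision has already been extracted from the PDF and that case numbers appear in the form "M.1234", optionally prefixed by "COMP/" or "IV/". The pattern, the function name `extract_citations` and the sample text are all hypothetical.

```python
import re
from collections import Counter

# Assumed pattern: case numbers written as "M.1234", optionally prefixed
# by "COMP/" or "IV/". The real parser may rely on different rules.
CASE_RE = re.compile(r"\b(?:COMP/|IV/)?M\.\s?(\d{1,5})\b")


def extract_citations(text, own_case):
    """Count how often each other case number appears in a decision's text."""
    counts = Counter(f"M.{num}" for num in CASE_RE.findall(text))
    counts.pop(own_case, None)  # a decision naming itself is not a citation
    return dict(counts)


# Hypothetical decision text, invented for this example.
sample = (
    "Case M.5000. As in COMP/M.4439 and M.1234, the market is national. "
    "See also M.4439, paragraph 12."
)
citations = extract_citations(sample, own_case="M.5000")
print(citations)
```

A purely pattern-based approach like this one shows why a margin of error is possible: a case number written in an unexpected format would simply not be matched.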

The focus of Merger cite is restricted to merger cases for which a publicly available legal document exists in Phase I proceedings (Art. 6(1)(a), 6(1)(b) and/or 6(2)) or Phase II proceedings (Art. 8(1), 8(2) and 8(3)). Decisions relating to other articles of competition law fall outside the scope of this project and are not considered in the citation landscape in any way.

The Search tool serves to explore the citation landscape. The full universe of merger cases can be seen by leaving the search field empty. A particular case can be selected by entering its case number or the name of an involved company. Results are grouped in three tables:

  • Case info: gives basic information for every case where a legal document exists. The fields include the case number, case name (with a link to the decision), dates of notification and of the decision's publication, the legal article governing the decision, phase of proceedings, NACE code, and an indicator for decisions in English. Please note that some older decisions adopted under Regulation 4064/89 are not present in DG COMP's merger database and are therefore not included in Merger cite.

  • Backward citations: this database can be used to search previous decisions ('cited') mentioned in the body of the searched decision ('citing'). Please note that the table is empty if no backward references were made in the searched decision.

  • Forward citations: this database can be used to search subsequent decisions ('citing') that referenced the searched decision ('cited'). Please note that the table is empty if no forward references were made to the searched decision.
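
The difference between the backward and forward tables can be sketched with a small list of citation pairs. This is only an illustration: the case numbers, the tuple layout and the function names are hypothetical and do not describe how the website stores or queries its data.

```python
# Hypothetical citation pairs: (citing case, cited case, occurrences).
pairs = [
    ("M.300", "M.100", 2),
    ("M.300", "M.200", 1),
    ("M.400", "M.300", 3),
]


def backward_citations(case, pairs):
    """Previous decisions mentioned in `case` (here `case` is 'citing')."""
    return [(cited, n) for citing, cited, n in pairs if citing == case]


def forward_citations(case, pairs):
    """Subsequent decisions that reference `case` (here `case` is 'cited')."""
    return [(citing, n) for citing, cited, n in pairs if cited == case]


back = backward_citations("M.300", pairs)   # what M.300 cites
fwd = forward_citations("M.300", pairs)     # who cites M.300
empty = forward_citations("M.400", pairs)   # nobody cites M.400: empty table
print(back, fwd, empty)
```

In this toy data, M.300 has two backward citations and one forward citation, while the forward table for M.400 is empty because no other decision refers to it.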

The Backward and Forward databases contain several variables, described as follows:

  • Citing case: the unique case number of the citing merger case.

  • Citing name: name of the merger case (identity of the involved companies).

  • Citing notification: the notification date of the citing case.

  • Cited case: the unique case number of the cited merger case.

  • Cited name: name of the merger case (identity of the involved companies).

  • Cited notification: the notification date of the cited case.

  • Citations: number of occurrences of the citing-cited pair.

  • Total citations: total number of citations by the citing case (backward database) or in the cited case (forward database).

  • Same NACE: indicates whether the citing-cited pair shares at least one NACE sector code (e.g. C21, J61).
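
To make the variables above concrete, here is a minimal sketch of how a row of the backward table could be assembled. It rests on two assumptions that the text does not confirm: that "Total citations" is the sum of occurrences over all cases cited by the citing case, and that each case carries a set of NACE codes. All names and data are hypothetical.

```python
# Hypothetical NACE codes per case and hypothetical citation pairs.
nace = {
    "M.100": {"C21"},
    "M.200": {"J61"},
    "M.300": {"C21", "C20"},
}
pairs = [("M.300", "M.100", 2), ("M.300", "M.200", 1)]


def build_rows(pairs, nace):
    """Build backward-table rows from (citing, cited, occurrences) pairs."""
    # Assumption: total citations = sum of occurrences per citing case.
    totals = {}
    for citing, _, n in pairs:
        totals[citing] = totals.get(citing, 0) + n

    rows = []
    for citing, cited, n in pairs:
        shared = nace.get(citing, set()) & nace.get(cited, set())
        rows.append({
            "citing_case": citing,
            "cited_case": cited,
            "citations": n,
            "total_citations": totals[citing],
            "same_nace": bool(shared),  # True if at least one code is shared
        })
    return rows


rows = build_rows(pairs, nace)
for row in rows:
    print(row)
```

Here the pair M.300 and M.100 shares the code C21, so Same NACE is true for that row, while the pair M.300 and M.200 has no code in common.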

The Visualization tool serves to visualize the citation landscape of a case as a network. For each case, the tool takes all its citations (backward and forward) and plots the relationships between them. This shows the position of a specific case among its citations and also how dense the citation network is.
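
The network idea can be sketched without any plotting: collect the selected case together with every case it cites or is cited by, keep the citation links among those cases, and measure how dense the result is. The sketch below is an assumption about what such a network contains, not the tool's actual code. The density formula used is the standard one for directed graphs (links present divided by links possible), and the data is invented.

```python
def ego_network(case, edges):
    """Nodes: `case` plus every case it cites or is cited by.
    Edges: all citation links among those nodes."""
    nodes = {case}
    for citing, cited in edges:
        if citing == case:
            nodes.add(cited)
        elif cited == case:
            nodes.add(citing)
    sub_edges = [(a, b) for a, b in edges if a in nodes and b in nodes]
    return nodes, sub_edges


def density(nodes, sub_edges):
    """Share of possible directed links that are actually present."""
    n = len(nodes)
    return len(set(sub_edges)) / (n * (n - 1)) if n > 1 else 0.0


# Hypothetical (citing, cited) links.
edges = [
    ("M.300", "M.100"),
    ("M.300", "M.200"),
    ("M.400", "M.300"),
    ("M.400", "M.100"),
    ("M.500", "M.400"),
]
nodes, sub_edges = ego_network("M.300", edges)
d = density(nodes, sub_edges)
print(sorted(nodes), len(sub_edges), round(d, 3))
```

In this example M.500 is left out because it neither cites nor is cited by M.300, while the link from M.400 to M.100 is kept because both cases are neighbours of M.300.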

Data collection is ongoing, and the webpage will be updated at irregular intervals as new decisions become available.

Legal note: The software is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and non-infringement. In no event shall the author or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software.