Hateful Memes Challenge

The Hateful Memes Challenge comprises a set of open source datasets around which competitions have been (or are being) held. It is designed to measure progress in multimodal vision-and-language reasoning and understanding.

("mean" memes for illustrative purposes; not actual examples, which are hateful and would be distasteful to show here)

About

Papers, splits and other details.

In order for AI to become a more effective tool for detecting hate speech, it must be able to understand content the way people do: holistically. When viewing a meme, for example, we don’t think about the words and photo independently of each other; we understand the combined meaning together. This is extremely challenging for machines, however, because it means they can’t just analyze the text and the image separately. They must combine these different modalities and understand how the meaning changes when they are presented together. To catalyze research in this area, Facebook AI has created a dataset to help build systems that better understand multimodal hate speech. We released this Hateful Memes dataset to facilitate research in true multimodal reasoning and understanding, and organized a competition around the dataset at NeurIPS 2020. Read more in the blog post announcement.

The Hateful Memes Challenge was organized over two phases: the "seen" phase 1, which is described in the NeurIPS paper, and the "unseen" phase 2, which was used to determine the prize winners and is described in the NeurIPS competition report. In addition, we've released a finer-grained set of annotations, which is the subject of a shared task.

The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes

NeurIPS 2020

This work proposes a new challenge set for multimodal classification, focusing on detecting hate speech in multimodal memes. It is constructed such that unimodal models struggle and only multimodal models can succeed: difficult examples ("benign confounders") are added to the dataset to make it hard to rely on unimodal signals. The task requires subtle reasoning, yet is straightforward to evaluate as a binary classification problem. We provide baseline performance numbers for unimodal models, as well as for multimodal models with various degrees of sophistication. We find that state-of-the-art methods perform poorly compared to humans, illustrating the difficulty of the task and highlighting the challenge that this important problem poses to the community.

The Hateful Memes Challenge: Competition Report (coming soon)

JMLR Special Issue on NeurIPS Competitions 2021

Machine learning and artificial intelligence play an ever more crucial role in mitigating important societal problems, such as the prevalence of hate speech. We describe the Hateful Memes Challenge competition, held at NeurIPS 2020, focusing on multimodal hate speech. The aim of the competition is to facilitate further research into multimodal reasoning and understanding.

The Finer-Grained Hateful Memes Challenge: Shared Task

Workshop on Online Abuse and Hate (WOAH) at ACL 2021

Read more about the shared task here.

To get started with your own research project on this dataset, check out the starter kit in MMF.

Download

Getting the train, dev and test data.

In order to download the dataset, you need to indicate agreement to the dataset license below.

Hateful Memes Dataset License Agreement

In order to access the Hateful Memes Dataset (as defined below), you (as defined below) must first agree to this Hateful Memes Dataset (“HM Dataset”) License Agreement (“Agreement”). You may not use the HM Dataset if you do not accept this Agreement. By clicking to accept, accessing the HM Dataset, or both, you hereby agree to the terms of the Agreement. If you are agreeing to be bound by the Agreement on behalf of your employer or other entity, you represent and warrant to Facebook that you have full legal authority to bind your employer or such entity to this Agreement. If you do not have the requisite authority, you may not accept the Agreement or access the HM Dataset on behalf of your employer or other entity.

This Agreement is effective upon the earlier of the date that you first access the HM Dataset or accept this Agreement (“Effective Date”), and is entered into by and between Facebook, Inc. (“Facebook”), and you, or your employer or other entity (if you are entering into this agreement on behalf of your employer or other entity) (“Participant” or “you”).

(1) Subject to Participant’s compliance with the terms and conditions of this Agreement, Facebook hereby grants to Participant, a limited, non-exclusive, non-transferable, non-sublicensable license to: (a) use the HM Dataset to research, develop and improve software, algorithms, machine learning models, techniques and technologies designed to detect manipulated media, images, audio and videos (the “Purpose”) and (b) distribute and reproduce up to a total of one hundred (100) images from the HM Dataset per Participant for research or academic publications related to the Purpose. If you include images from the HM Dataset in a research or academic publications, then you shall include attribution to Getty Images in one of the following formats:

  • “Image above is a compilation of assets, including ©Getty Images/[photographer name]”. If no photographer name is listed, attribution shall be as follows:
  • “Image above is a compilation of assets, including ©Getty Images/[collection name]”. If no collection name is listed, attribution shall be as follows:
  • “©Getty Images”.

(2) Subject to Participant’s compliance with the terms and conditions of this Agreement, Participant retains its intellectual property rights in and to all algorithms, software, machine learning models, techniques and technologies developed or otherwise derived by Participant from the use of the HM Dataset. Such algorithms, software, machine learning models, techniques and technologies may be used for academic and commercial purposes.

(3) As between Facebook and Participant, Facebook retains all intellectual property rights in and to the HM Dataset. All rights not expressly granted under this Agreement by Facebook are reserved.

(4) At any time, Facebook may require Participant to delete all copies of the HM Dataset (in whole or in part) in Participant’s possession and control. Participant will promptly comply with any and all such requests. Upon Facebook’s request, Participant shall provide Facebook with written confirmation of Participant’s compliance with such requirement.

(5) If Facebook reasonably believes (as determined at Facebook’s sole discretion) that you are or are likely to be in violation of the terms of this Agreement, then Facebook or Facebook’s designee (at Facebook’s sole expense) may audit your use, storage and distribution of the HM Dataset, including, without limitation, any and all records, files associated with the HM Dataset, and this Agreement. You hereby agree to cooperate with such audit.

(6) Participant will not:

  • modify, translate, or create any derivative works based upon the HM Dataset;
  • distribute, copy, disclose, assign, sublicense, embed, host or otherwise transfer the HM Dataset to any third party, except as described in Section 1(b) above;
  • remove or alter any copyright, trademark or other proprietary notices appearing on or in copies of the HM Dataset;
  • use the HM Dataset in a pornographic, defamatory or other unlawful manner, or in violation of any applicable regulations or laws;
  • incorporate the HM Dataset into any other program, dataset, or product;
  • use the HM Dataset to distribute manipulated images or videos (except as described in Section 1(b) above); or
  • use the HM Dataset for any purpose other than the Purpose specified in this Agreement.

(7) If you use the HM Dataset (or any portion thereof) in a manner that features models or property in connection with a subject that would be unflattering or unduly controversial to a reasonable person, you must indicate: (1) that the content is being used for illustrative purposes only, and (2) any person depicted in the content is a model. For example, you could say: “Stock photo. Posed by model.”

(8) Facebook always appreciates your feedback and other suggestions about the HM Dataset. However, you should know and you hereby agree that we may use your feedback and suggestions without any restriction or obligation, including, without limitation, to compensate you or to keep them confidential.

(9) Upon the termination of this Agreement, Participant will immediately stop using the HM Dataset and destroy all copies of the HM Dataset and related materials in Participant’s possession and control. Additionally, Facebook may, at any time, for any reason or for no reason, terminate this Agreement, effective immediately upon notice to the Participant. Upon termination, the license granted to Participant hereunder will immediately terminate and Participant will immediately stop using the HM Dataset and destroy all copies of the HM Dataset and related materials in Participant’s possession or control. Except for the licenses granted to Participant, the other provisions of this Agreement will survive any termination.

(10) THE HM DATASET IS PROVIDED “AS IS” WITHOUT ANY EXPRESS OR IMPLIED WARRANTY OF ANY KIND, INCLUDING WARRANTIES OF MERCHANTABILITY, TITLE, NON-INFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE.

(11) IN NO EVENT WILL FACEBOOK, ITS CONTRACTORS AND ITS LICENSORS BE LIABLE FOR ANY CONSEQUENTIAL, INCIDENTAL, EXEMPLARY, PUNITIVE, SPECIAL, OR INDIRECT DAMAGES (INCLUDING DAMAGES FOR LOSS OF PROFITS, BUSINESS INTERRUPTION, OR LOSS OF INFORMATION) ARISING OUT OF OR RELATING TO THIS AGREEMENT OR ITS SUBJECT MATTER, EVEN IF FACEBOOK HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

(12) FACEBOOK, ITS LICENSORS AND ITS CONTRACTOR’S TOTAL LIABILITY ARISING FROM OR RELATING TO THIS AGREEMENT AND ITS SUBJECT MATTER WILL NOT EXCEED ONE HUNDRED DOLLARS ($100).

(13) Either party may terminate this Agreement if the other is in material breach of this Agreement and such breach remains uncured for thirty (30) days following receipt of written notice of the breach.

(14) From time to time, Facebook may provide your name and the fact that you are a licensee of the HM Dataset to our licensors.

(15) Participant will comply with all applicable export controls, import controls and trade sanctions applicable to the HM Dataset. You shall obtain, at your sole cost and expense, any export and import (temporary and permanent) license and other official authorization applicable to the HM Dataset. This Agreement, and your relationship with Facebook under this Agreement, shall be governed by the laws of the State of California without regard to its conflict of laws provisions. You and Facebook agree to submit to the exclusive jurisdiction of the courts located within the county of Santa Clara, California to resolve any legal matter arising from the Agreement. Notwithstanding this, you agree that Facebook shall still be allowed to apply for injunctive remedies (or an equivalent type of urgent legal relief) in any jurisdiction. Facebook may make changes to this Agreement at any time with notice to Participant and the opportunity to decline further use of the HM Dataset. You should look at the Agreement and check for notice of any changes regularly. Changes will not be retroactive. They will become effective, and will be deemed accepted by Participant, (a) immediately for those who become Participants after the notification is posted; or (b) for pre-existing Participants, on the date specified in the notice, which will be no sooner than 30 days after the changes are posted (except changes required by law which will be effective immediately). If You do not agree with the modifications to the Agreement, You may terminate Your use of HM Dataset, which will be Your sole and exclusive remedy. You agree that Your continued use of HM Dataset constitutes Your agreement to the modified terms of this Agreement. 
No failure to exercise and no delay in exercising any right, remedy or power hereunder will operate as a waiver thereof, nor will any single or partial exercise of any right, remedy or power hereunder preclude any other or further exercise thereof or the exercise of any other right, remedy or power provided herein or by law or in equity. Participant may not assign its rights and obligations hereunder without prior written consent of Facebook. Facebook may assign its rights and obligations hereunder at any time to any party without Participant’s consent. If any provision of this Agreement is found by a court of competent jurisdiction to be void, invalid or unenforceable, the same will be reformed to comply with applicable law or stricken if not so conformable, so as not to affect the validity or enforceability of the remainder of this Agreement. This Agreement constitutes the entire agreement between the parties concerning the subject matter hereof and supersedes all prior or contemporaneous representations, discussions, negotiations, conditions, and agreements between the parties relating to the subject matter hereof.

Fill out the form below and press "Download" to download the dataset.

Leaderboard

Results on the seen and unseen test set.

If you want to add your paper to the leaderboard, please contact us and provide the relevant information.

# | Publication | Seen AUROC (Acc.) | Unseen AUROC (Acc.) | Code

* = ensemble. The old DrivenData leaderboards used for the competition are here for Phase 1 and here for Phase 2.
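
For readers unfamiliar with the two leaderboard metrics, the following is a minimal sketch of how AUROC and accuracy can be computed for a binary classifier. It is not the official evaluation script. The label convention (1 = hateful, 0 = not hateful), the 0.5 decision threshold, and the function names `auroc` and `accuracy` are assumptions made for illustration.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed from the rank-sum (Mann-Whitney U) statistic.

    Equals the probability that a randomly chosen positive example receives a
    higher score than a randomly chosen negative one; ties count as one half.
    Assumption: label 1 marks the positive (hateful) class.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the group of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative example")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of examples classified correctly at the given threshold (assumed 0.5)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


if __name__ == "__main__":
    # Toy values for illustration only; not taken from the dataset.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(f"AUROC: {auroc(toy_labels, toy_scores):.2f}")
    print(f"Acc.:  {accuracy(toy_labels, toy_scores):.2f}")
```

AUROC depends only on how the scores rank the examples, so it needs no threshold, whereas accuracy changes with the threshold chosen. This is one reason the two numbers are reported side by side.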

Contact

Any questions? Feel free to reach out!

You can reach us via email.