Data Challenge SFP 2022-23

The VisioMel project : search for a digital signature evaluating the risk of metastatic evolution of primary melanoma within 5 years following the initial diagnosis


This project, supervised by the French Society of Pathology together with the French Society of Dermatology, the Cutaneous Cancer Group (GCC) and the National Professional Council of Pathologists (CNPath), aims to organize an international data challenge on melanoma relapse in March-April 2023. This event, organized in collaboration with the Health Data Hub (HDH) and with the support of the Public Investment Bank (BPI), is a worldwide competition whose objective is to solve a specific problem within an allotted time using strongly anonymized data. The challenge is therefore open to data scientists (researchers, industry professionals, students, etc.) from all around the world. Challengers will have to build an artificial intelligence (AI) algorithm able to predict melanoma relapse within 5 years of the initial diagnosis. As a final step, making the data and algorithms resulting from the data challenge accessible is encouraged, in order to enable research in the interest of all.


Melanoma is a cancer of the skin or, more rarely, of the mucous membranes, which develops from melanocytes (cells responsible for skin pigmentation).
The causes of the disease are multifactorial but mainly depend on the interaction between UV exposure (period and intensity), host factors (presence of atypical nevi*, high number of nevi, skin phototype) and genetic factors.
The National Cancer Institute estimates that 15,500 new cases of cutaneous melanoma were detected in France in 2018 (7,900 men and 7,600 women). With 1,800 deaths that same year (1,040 men and 840 women), this cancer represents 1.2% of cancer deaths in France for both sexes combined. It is one of the cancers whose incidence* and mortality have significantly increased over the past decades.
These tumors represent around 10% of skin cancers but are the most serious because of their high metastatic potential. The development of metastases is a factor of poor prognosis*. It means that cancerous cells from the primary tumor colonize neighboring healthy tissues, leading to the formation of secondary tumors in the lymph nodes (loco-regional melanoma) or in other organs (distant melanoma). Metastases are rarely observed at the time of initial diagnosis; they generally appear during follow-up of the disease.
The diagnosis of melanoma is made by microscopic analysis of the tumor tissue by a pathologist. From a stained histological slide, the pathologist establishes the final diagnosis of the disease and determines the severity of the lesions according to prognostic factors* (size of the tumor, presence of ulceration, mitotic rate, etc.). These prognostic factors* are then combined into a cancer stage according to the AJCC classification.
This analysis, combined with clinical prognostic factors (age, sex, medical history of the patient, etc.), allows the dermatologist to adapt the treatment to the severity of the disease.
Patient survival* essentially depends on the stage of the cancer at the time of diagnosis. For primary cutaneous melanoma without metastasis, the prognosis is mainly related to the thickness of the melanoma. Thus, at an early stage (thin melanoma less than 1 mm thick), the 5-year survival is estimated at more than 95%. Thicker melanomas (over 4 mm) have a 50% risk of relapse within 5 years. If the melanoma is metastatic at the time of diagnosis or if it has relapsed, additional surgical (lymph node dissection, excision of metastases) or medical (immunotherapy, targeted therapy) treatments can then be proposed.

Questions explored through the VisioMel project

1st Question : Although thin melanomas (less than 1 mm thick) are associated with a good prognosis, they are responsible for a significant and still poorly understood proportion of relapses and deaths. Similarly, for intermediate-thickness melanomas (between 1 and 4 mm), which carry a higher risk of relapse, there are no predictive factors for this possible metastatic evolution. Adjuvant* treatments now exist to limit this risk for some operable melanomas assessed as being at high risk of relapse. However, beyond their high cost, these treatments also expose patients to significant drug toxicities. This is why it is becoming urgent to identify the patients who do not relapse without adjuvant treatment, so that these therapies are targeted only at patients who can benefit clinically from them. These treatments could also, in the future, be considered in a neoadjuvant setting*.
The search for new predictive markers of relapse in primary non-metastatic melanomas using artificial intelligence would complement the pathologist's analysis and thus help adapt patient care. In addition, identifying such markers for thin melanomas, whose recurrence is particularly difficult to predict, would constitute a major step forward in melanoma care.
2nd Question : In parallel, determining the mutational status of the tumor (especially the BRAF V600E mutation) is essential for the prescription of a targeted therapy. The presence or absence of such a mutation also makes it possible to distinguish between different types of melanoma which may have distinct clinical evolutions. The search for such a mutation currently requires complementary techniques that can be costly. As a result, these techniques are currently only requested for advanced-stage lesions. Data on this mutational status will be made available in the post-challenge database to enable future research on the prediction of BRAF status. This question will not be addressed during the challenge.
AI approaches are particularly relevant for creating a tool that helps doctors detect potential cases of relapse rapidly and precisely. Indeed, it is essential to identify new prognostic factors that the clinical or histological examination might not perceive. AI approaches have already been used, but they are based on fairly small sample sizes and rely on highly supervised methods. The size of the cohort considered here will allow the use of unsupervised methods. In addition, combining clinical, histological and molecular variables would make it possible to go beyond the current separation of these disciplines and to increase the chances of identifying prognostic "patterns".

Material and methods

General view of the project

Patient selection : As the aim is to predict the metastatic evolution of melanomas, only patients with localized disease at the time of diagnosis (stage 0 to IIC) are included in the study. Patients are selected from the RIC-Mel database. Thanks to the efforts of a network of physicians from 49 French inclusion centers, this national database, created in 2012, now collects data from around 40,000 patients with melanoma.
2,000 patients will be selected according to the following criteria:

  • Cancer stage between 0 and IIC,
  • Initial diagnosis between 2012 and 2016 (because relapse is assessed 5 years after the initial diagnosis)
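
As an illustration only, the two selection criteria above can be written as a simple filter. The record layout (patient_id, stage, diagnosis_year) and the exact stage labels are assumptions made for this sketch; they are not taken from the RIC-Mel database schema.

```python
from dataclasses import dataclass

# AJCC stage labels treated as "stage 0 to IIC" in this sketch.
# How stages are actually encoded in RIC-Mel is an assumption.
LOCALIZED_STAGES = {"0", "IA", "IB", "IIA", "IIB", "IIC"}


@dataclass
class Patient:
    patient_id: str      # hypothetical pseudonymized identifier
    stage: str           # AJCC stage at initial diagnosis
    diagnosis_year: int  # year of initial diagnosis


def is_eligible(p: Patient) -> bool:
    """Apply both criteria: localized stage and diagnosis between 2012 and 2016."""
    return p.stage in LOCALIZED_STAGES and 2012 <= p.diagnosis_year <= 2016


def select_patients(patients):
    """Keep only the patients who satisfy both inclusion criteria."""
    return [p for p in patients if is_eligible(p)]
```

A patient diagnosed in 2017, or diagnosed at stage III or IV, would be excluded by this filter.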

Selected variables for the challenge : The AI algorithm will be trained to predict the 5-year relapse of non-metastatic primary melanomas on the basis of clinical and histological data. The algorithm thus constructed will have to predict the recurrence of the tumor on the sole basis of the analysis of images, that is to say of histological data. Prediction of the BRAF mutational status is a secondary goal that may require a second data set, since only a subgroup of tumors was characterized molecularly.

In a first phase, thanks to the efforts of the inclusion centers, the following clinical variables are updated and retrieved from the RIC-Mel database or from patient medical reports:

  • age,
  • sex,
  • patient's medical history,
  • site of the melanoma (upper limb, lower limb, trunk, cephalic area),
  • Breslow thickness, presence or absence of ulceration,
  • family history,
  • cancer progression/recurrence within 5 years (dates of events). 
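
The last variable in the list above (dates of progression or recurrence events) is what a binary 5-year relapse label would be derived from. The following is a minimal sketch of that derivation; the function names are hypothetical, and it deliberately ignores the case of patients followed for less than 5 years without an event, which a real study would have to handle.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def relapsed_within_5_years(diagnosis: date, first_event: Optional[date]) -> int:
    """Return 1 if the first progression/recurrence occurs within 5 years, else 0."""
    if first_event is None:
        return 0  # no event recorded
    if first_event < diagnosis:
        raise ValueError("event date precedes diagnosis date")
    return int(first_event <= add_years(diagnosis, 5))
```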

In a second phase, the corresponding histological slides are retrieved from the archives and digitized after pseudonymization in order to be included in the challenge database. This task is carried out by the pathological anatomy and cytology (ACP) laboratories that analyzed the excision of the primary tumor.
Great care is taken to ensure the quality of the data by involving the inclusion centers and verifying the completeness of the data.

Anonymization : The data will then be strongly anonymized, with no possibility of tracing back to the patient's identity, and stored on the Health Data Hub servers. A re-identification risk analysis will be carried out in collaboration with DrData.
Course of the challenge : The database will then be uploaded to the platform hosting the data challenge and will remain available for a period of 6 to 7 weeks.
The data from the 2,000 patients will be split into three different sets.

Each set will be built in such a way as to overcome potential biases due to exogenous factors: ACP laboratory (preparation and staining of slides), type of scanner, etc. Similarly, the stage of the cancer and the sex of the patient will be distributed in a balanced way in the different sets.
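
One common way to obtain such balanced sets is a stratified split: patients are grouped by the factors to balance, and each group is divided across the three sets in the same proportions. The sketch below is only an illustration of that idea; the 60/20/20 proportions, the field names and the function name are assumptions, since the actual sizes and construction of the three sets are not specified here.

```python
import random
from collections import defaultdict


def stratified_split(records, keys, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split records into len(fractions) sets, balancing the given keys.

    records   : list of dicts, one per patient
    keys      : fields to balance, e.g. ("stage", "sex", "laboratory", "scanner")
    fractions : hypothetical share of each set (must sum to 1)
    """
    rng = random.Random(seed)

    # Group patients that share the same combination of balancing factors.
    strata = defaultdict(list)
    for record in records:
        strata[tuple(record[k] for k in keys)].append(record)

    splits = [[] for _ in fractions]
    for key in sorted(strata):
        group = strata[key]
        rng.shuffle(group)
        n = len(group)
        start, cumulative = 0, 0.0
        for i, fraction in enumerate(fractions):
            cumulative += fraction
            # The last set takes whatever remains, so no patient is dropped.
            end = n if i == len(fractions) - 1 else round(cumulative * n)
            splits[i].extend(group[start:end])
            start = end
    return splits
```

With many balancing factors some groups become very small, in which case exact proportions cannot be respected in every group and the balance is only approximate.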
The goal will be to quantify the performance of the algorithm in the prediction of melanoma relapse (and mutation status, still under discussion).
The performance of the algorithms proposed by the challengers will be evaluated on a simple binary criterion (absence or presence of metastatic evolution at 5 years). The error between the prediction and the “ground truth” will be weighted by the seriousness of that error using a metric. This mathematical weighting will have clinical meaning and will be communicated at the time of the competition.
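
Since the actual metric is only announced at the time of the competition, the following is purely an illustration of what weighting an error by its seriousness can look like: a class-weighted log loss in which a missed relapse is penalized more heavily than a false alarm. The weights (2.0 and 1.0) and the function name are hypothetical and carry no clinical meaning.

```python
import math


def weighted_log_loss(y_true, y_prob, w_relapse=2.0, w_no_relapse=1.0, eps=1e-15):
    """Mean log loss with a separate weight for each true class.

    y_true : 0/1 labels (1 = metastatic evolution within 5 years)
    y_prob : predicted probabilities of relapse
    The weights are hypothetical, not the challenge's actual weighting.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        if y == 1:
            total += -w_relapse * math.log(p)
        else:
            total += -w_no_relapse * math.log(1 - p)
    return total / len(y_true)
```

With these weights, predicting a 10% relapse probability for a patient who does relapse costs twice as much as predicting a 90% probability for a patient who does not.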

Regulatory framework of the project

All stages of the project have been supervised and validated by DrData. This consulting structure specializes in the protection of personal data in the health field. Their teams of experts support hospitals, healthcare professionals and digital companies (artificial intelligence, telemedicine, etc.) in their GDPR compliance and the privacy by design of all their processes and projects.


The project is financially supported by Bpifrance (BPI) as well as by donations from Bristol Myers Squibb (BMS) and Pierre Fabre.


Source: Cancer Foundation

Excision : Surgical procedure consisting in removing from the body, and if possible in its entirety, an element that is harmful or useless to it.
Prognostic factor : Situation, state or characteristic of a person that is considered when establishing a prognosis. There are many different prognostic factors, including the type and stage of the cancer as well as the age and overall health of the person affected.
Incidence : Total number of new cases of a disease diagnosed in a given population during a specified period of time.
Nevus : Beauty spot/mole. It is a flat or raised spot that corresponds to a cluster of skin cells: melanocytes.
Prognosis : Expected outcome or course of a disease or chance of recovery or risk of recurrence.
Relapse : Cancer that comes back (recurs) after a period of time when the patient has had no signs or symptoms (remission). We speak of local recurrence when the cancer comes back in the same region of the body as the initial location (primary site) of the tumor. We speak of a distant recurrence when the cancer appears again in a region of the body other than the initial site (primary site) of the tumor.

Survival : The percentage of people with a disease who are still alive at some point after being diagnosed. Statistical data on cancer survival are often provided for a 5-year survival period. This data indicates the percentage of people with a particular disease who are still alive 5 years after being diagnosed. These may be people who do not have a recurrence, who are in remission or who are still receiving treatment.
Adjuvant therapy : Treatment given in addition to first-line treatment (first treatment or standard treatment) to help reduce the risk of the disease coming back (recurring).
Neoadjuvant therapy : Administration of therapeutic agents before a main treatment.
