IEEE VAST Challenge 2009

Judging Criteria

Judging Process and Criteria for Judging

Last updated, May 22nd, 2009
(Any significant changes made here will be reflected on the history page)

Submissions will be reviewed by judges both from the analytic community and from the visualization community.  All individual reviews will be collected and provided to participants. 

Both qualitative and quantitative scores will be computed. 

Accuracy metrics will evaluate the correctness of the data table answers based on the known “ground truth” embedded in the dataset.  These scores will be given to the teams. 

The qualitative metrics will evaluate the perceived utility of the system, including the visualizations and the analytic process used.  These will be based on participants’ descriptive explanations (short or detailed answers, and the video). 

Due to the increased number of submissions, a two-step judging process will be followed.  Detailed feedback on the short and detailed answers will thus be provided only after the initial round of judging. 

1st level peer review

Entries will be sent out for an initial peer review (similar to paper reviews). 
Reviewers will be asked to view the video, read the text materials provided by the teams, and then answer the following questions using a 1-7 scale and providing comments:
1. How clearly does the submission explain how Visual Analytics tools were used to analyze the data and scenario?
2. Rate the usefulness, efficiency and intuitiveness of the analytical process.
3. Rate the usefulness, efficiency and intuitiveness of the visualizations.
4. Rate the usefulness, efficiency and intuitiveness of the interactions.
5. How much novelty do you see in this submission (data processing, visualization, interaction, hypothesis generation or evaluation, overall process, etc…)? 
6. What was your overall satisfaction with the submission? 
7. Would you nominate this entry for one or more awards? 

Below are the instruction documents that the reviewers will be using for the review process.

- Review form

- Instructions for Reviewers

2nd level review

After the 1st round of reviews, the top entries will be reviewed in more detail by a team composed of the challenge committee and professional analysts.  The final scoring criteria used in the 2nd round are described below.

A- Accuracy

The correctness of answers to the questions and the evidence provided will be scored.  Participants will be given points for correct answers and penalized for incorrect answers.  The correct answers will be based on the “ground truth” embedded in the data.  For example, when a question asks for the participants in a certain activity, teams will be given points for finding those who did participate and penalized for missed participants or for identifying irrelevant people. 

1. Tables and lists:  Simple answers provided in the form of tables will be processed automatically and a score generated.  A generic suggestion for improving that score will be provided as well.

2. Text answers:  A human reader familiar with the solution will identify which of the ground truth elements (e.g. activities, groups, changes over time, etc.) have been identified in the submitted answers and whether there is adequate evidence to support or refute the proposed hypotheses.  An overall score will be given for the accuracy of each answer. 

If participants report additional suspicious elements that were not part of the known ground truth, the analysts who created the datasets will review these elements for legitimacy.  If they are deemed to be legitimate (i.e. supported by evidence), additional points will be awarded to the teams who discovered them.
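
The point-and-penalty idea described above for table answers can be illustrated with a small sketch.  This is not the committee's actual scoring program: the function name score_answer, the equal default weights of 1.0, and the sample names are all assumptions made for illustration.  It only shows how hits, missed items and irrelevant items could each affect a score.

```python
def score_answer(submitted, ground_truth, reward=1.0, penalty=1.0):
    """Hypothetical scoring sketch; the weights are assumed, not official.

    Points are given for each ground-truth item that was found, and a
    penalty is applied for each missed item and each irrelevant item.
    """
    submitted = set(submitted)
    ground_truth = set(ground_truth)

    hits = submitted & ground_truth        # correctly identified
    missed = ground_truth - submitted      # in the ground truth, not reported
    irrelevant = submitted - ground_truth  # reported, not in the ground truth

    score = reward * len(hits) - penalty * (len(missed) + len(irrelevant))
    return {
        "hits": sorted(hits),
        "missed": sorted(missed),
        "irrelevant": sorted(irrelevant),
        "score": score,
    }


if __name__ == "__main__":
    # Made-up names: two correct, one missed ("cat"), one irrelevant ("eve").
    result = score_answer(["ann", "bob", "eve"], ["ann", "bob", "cat"])
    print(result["score"])  # 0.0 with the assumed equal weights
```

With these assumed weights, one missed participant and one irrelevant person cancel out the two correct answers.  Legitimate findings outside the known ground truth are reviewed by the analysts rather than scored automatically, so a sketch like this would not cover them.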

B- Subjective assessment

The subjective assessment of the quality of the visualizations, interactions and support for the analytical process will be based on the text answers and the video that participants provide.  The video will be used to judge the interactive aspects of the tools and the process used to arrive at the assessments.  Screenshots provided in the text answers will allow judges to rate the utility of the static visualizations and the utility of the visualizations in the analytics process.  Note that during these assessments the judges will not be able to ask the participants any questions, so the clarity of the explanations provided in the text is critical.  The judges cannot correctly assess something they do not understand.  The following criteria will be used for subjective assessments of the visualizations, interactions and support for the analytical process:

Primary criteria (the basis for the main score): 
• Utility of the tools - based on the description of the specific INSIGHTS the tools helped discover
• Quality of the static representations (e.g. meaningful layout, good use of color or icons, good labeling, saliency of information, etc.)
• Quality of the interaction
• Support for the analytics process (e.g. support for hypothesis generation and evaluation)

Secondary criteria (i.e. criteria that are also very important and may be used in award nominations)
• Scalability (i.e. are some aspects of the analysis automated?  Are the results of the automatic processing understandable and believable?  Are there mechanisms to guide the analysis or the use of the tools?)
• Versatility (i.e. can the tools handle multiple data types?)
• Data integration (can data from various sources be integrated?)
• Handling of missing data and uncertainty
• Support for collaboration
• Learnability (note that the clarity of the explanations will have a strong impact here)
• Reporting (how is the preparation of the debrief supported?)
• Other features such as a history mechanism, the ease of importing and exporting data, innovative features in general, and others.

C- Quality of the Debrief (GRAND Challenge only)

In addition to the accuracy rating (see above), analytics experts will award points for the quality of the debrief based on whether the debrief:
• Is written objectively
• Considers all available sources of intelligence (all datasets)
• Properly highlights caveats and expresses uncertainties or confidence in analytic judgments
• Properly distinguishes between underlying intelligence and analysts’ assumptions and judgments
• Incorporates alternative analysis where appropriate
• Contains logical argumentation.

Questions?  Send email to challengecommittee AT
