Final Report Rubric

Report Presentation
• Motivation:
– Does the report clearly outline the goals and questions addressed?
– Is the motivation for your task clear, plausible and rational?
– Is the problem statement well-defined using appropriate NLP terminology?
– Does the report state the importance, usefulness, and benefits of the work and its results?

• Structure
– Does the report content flow logically?
– Is it well organized, omitting information that should be common knowledge to your peers?
– Do you relegate less important information to an appropriate location (backmatter, software repository, footnote)?

• Visualization
– Does the report use appropriate figures, plots, and tables to justify preprocessing steps and design decisions, and to motivate discussions and explanations?

• Presentation (more important)
– Do the prose, references, sections, and visuals all complement each other in describing the logical flow?
– Is any corpus analysis (exploratory data analysis) done purposefully, to motivate model or experimental design?
– Are any visuals appropriately sized, captioned, and legible? Do they explain the material better than an equivalently sized block of prose?
– Do you correctly follow the formatting instructions, length limitations and submission rules?
Do not just report numbers; illustrate them (with figures and tables) and explain them. Do not assume that your audience knows what your numbers mean.
Points: 75 (full marks: 75; no marks: 0)

Report Content
• Originality
– What are the original elements of your project? (It’s not necessary that no group has done your task before, but your report needs to reflect your ability to think analytically and contribute novel analysis.)
– Do you articulate how your work is novel in light of the prior work?

• Relevance
– How strongly connected is the project to this course?
– Do you use core concepts of NLP taught in class?

• Related Work
– Do you present a study of work related to your task? (Formal academic references, useful web articles and posts, and other related material should be considered here. Remember to cite explicitly.)

• Technical Justification (more important):
– Is your technical approach suited to solving your proposed problem?
– Is your technical approach valid for your task and dataset?
– Are there technical flaws in the execution of the approach?
– Do you describe the data / corpora that you collected in an appropriate manner? (Self-annotated data may need evidence that the annotations are replicable, i.e., inter-annotator agreement; see the sketch after this list.)
– Are evaluations performed with the appropriate metrics and correctly interpreted?
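
For instance, if your group annotated its own data, a minimal sketch of how inter-annotator agreement might be computed (assuming two annotators and scikit-learn; the label lists are placeholders, not real annotations):

    # Hypothetical sketch: Cohen's kappa between two annotators' labels.
    # The label lists are placeholders; load your own annotation files instead.
    from sklearn.metrics import cohen_kappa_score

    annotator_a = ["POS", "NEG", "NEG", "POS", "NEU", "POS"]
    annotator_b = ["POS", "NEG", "POS", "POS", "NEU", "NEG"]

    kappa = cohen_kappa_score(annotator_a, annotator_b)
    print(f"Cohen's kappa: {kappa:.2f}")

Reporting the agreement score (and how disagreements were resolved) is evidence that your annotations are replicable.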

• Implementation (more important):
– Did you implement multiple models (a baseline and your best model)?
– Do you clearly delineate your group members’ original code from the public libraries or code repositories of others that you used?
– Did you implement or use the models correctly?
– Did you tune them appropriately, where resources allowed?
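
As an illustration only (not a required design), a minimal sketch of a majority-class baseline next to a tuned TF-IDF + logistic regression model, assuming scikit-learn; the texts and labels are placeholders standing in for your own data loading:

    # Hypothetical sketch: a trivial baseline versus a tuned "best" model.
    from sklearn.dummy import DummyClassifier
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.pipeline import make_pipeline

    # Placeholder data; replace with your project's corpus.
    texts = ["great movie", "terrible plot", "loved it", "boring and slow",
             "what a gem", "awful acting", "really enjoyable", "a waste of time"]
    labels = ["pos", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=0)

    # Baseline: always predict the training set's majority class.
    baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

    # "Best" model: TF-IDF features + logistic regression, tuned with a small grid search.
    pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    search = GridSearchCV(pipeline, {"logisticregression__C": [0.1, 1.0, 10.0]}, cv=2)
    search.fit(X_train, y_train)

    print("baseline accuracy:   ", baseline.score(X_test, y_test))
    print("tuned model accuracy:", search.score(X_test, y_test))

The particular models do not matter; what the rubric looks for is the structure: an explicit baseline, clearly delineated original code, and documented tuning.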

• Model Evaluation (more important):
– Do you address both macroscopic, dataset-wide performance (e.g., F1 measures) and microscopic, individual-instance performance (careful error analysis with diagnosis)? See the sketch after this list.
– Do you demonstrate an improvement in performance of your model over another, such as a baseline? (A baseline may be a simpler model, a simpler version of your model, or one referenced from other literature; make sure to give appropriate citations.)
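
A minimal sketch of how the two levels of evaluation might be combined, assuming scikit-learn; texts, y_true, and y_pred are placeholders standing in for your test set and your model's predictions:

    # Hypothetical sketch: dataset-wide metrics plus instance-level error analysis.
    from sklearn.metrics import classification_report, f1_score

    texts = ["great movie", "terrible plot", "loved it", "boring and slow"]
    y_true = ["pos", "neg", "pos", "neg"]
    y_pred = ["pos", "neg", "neg", "neg"]

    # Macroscopic view: per-class precision/recall/F1 and the macro-averaged F1.
    print(classification_report(y_true, y_pred))
    print("macro F1:", f1_score(y_true, y_pred, average="macro"))

    # Microscopic view: collect misclassified instances for manual diagnosis.
    for text, gold, pred in zip(texts, y_true, y_pred):
        if gold != pred:
            print(f"MISS: {text!r}  gold={gold}  pred={pred}")

Grouping the misses by a suspected cause (for example, negation or rare vocabulary) is what turns a score table into an error analysis.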

Note that your performance need not be very high (e.g., 90%) if your problem is hard, but you should show improvement over some baseline approach, along with conscientious efforts to improve performance.

• Results Interpretation (10%): How well are the evaluation results described and interpreted?

– Error analysis: explain, with evidence, why the model may be performing poorly (or not as well as you had hoped).
– Do you technically justify why your model is good or has improved, i.e., rationalize your approach’s effectiveness?
– Future improvements: discuss how you might further improve your model.
You do not have to implement or test all of your ideas if they are infeasible, but discussing them shows the grading staff that you have sound, valid ideas.
Points: 180 (full marks: 180; no marks: 0)

Miscellaneous
• Reproducibility
– Are the technical approach and the evaluation method described clearly and in enough detail for a peer to replicate them?
– Is your source code well-organized and any ancillary materials well documented?
– Are your results easy to replicate by running documented commands or executing a notebook?
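
A minimal sketch of what a documented, reproducible entry point might look like (the file name run_experiment.py, the --seed flag, and the omitted training code are hypothetical):

    # Hypothetical run_experiment.py: one documented command should reproduce the
    # reported numbers, e.g.  python run_experiment.py --seed 42
    import argparse
    import random

    import numpy as np

    def main() -> None:
        parser = argparse.ArgumentParser(description="Train and evaluate the report's models.")
        parser.add_argument("--seed", type=int, default=42,
                            help="random seed, fixed so results can be replicated")
        args = parser.parse_args()

        # Fix the sources of randomness the project uses.
        random.seed(args.seed)
        np.random.seed(args.seed)

        # ... load data, train the baseline and best models, print the reported metrics ...

    if __name__ == "__main__":
        main()

Pair this with a README that states the exact commands and package versions used to produce each reported number.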

• Limitations
– Do you state the principal limitations of your work (e.g., important aspects of the problem domain it does not cover), and how these factors might be mitigated?
• Backmatter
– Do you use the backmatter and supplemental materials (website, source code repository) effectively to complement the formal report body?
– Are your references bibliographically complete?
– Did your group appropriately fill out the Statement of Independent Work?
– Did you properly acknowledge and document how AI tools played an appropriate role in your experimentation, coding and report?
Points: 45 (full marks: 45; no marks: 0)