Project: Lexical Normalisation of Twitter Data

Reference no: EM131624742, Length: 1000 words

Overview

The goal of this Project is to assess the performance of some spelling correction methods on the problem of tweet normalisation, and to express the knowledge that you have gained in a technical report. This aims to reinforce concepts in approximate matching and evaluation, and to strengthen your skills in data analysis and problem solving.

1. One or more programs, implemented in one or more programming languages, which must:
- Determine the best match(es) for a token, with respect to a reference collection (dictionary)
- Process the data input file(s), to determine the best match for each token
- Evaluate the matches, with respect to the truly intended words, using one or more evaluation metrics

2. A README that briefly details how your program(s) work(s). You may use any external resources for your program(s) that you wish: you must indicate these, and where you obtained them, in your README. The program(s) and README are required submission elements, but will not typically be directly assessed.

3. A technical report, of 1000-1600 words, which must:
- Give a short description of the problem and data set
- Briefly summarise some relevant literature
- Briefly explain the approximate matching technique(s), and how it is (they are) used
- Present the results, in terms of the evaluation metric(s) and illustrative examples
- Contextualise the system's behaviour, based on the (admittedly incomplete) understanding from the subject materials
- Clearly demonstrate some knowledge about the problem
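
The three program requirements in item 1 (match a token against a dictionary, process every token, evaluate against the intended words) can be sketched as below. This is only an illustration under stated assumptions: the project does not prescribe a particular matching technique or metric, so Levenshtein edit distance with unit costs and a precision/recall pair are merely one reasonable choice. The function names `edit_distance`, `best_matches` and `evaluate`, and the toy word list in the usage lines, are hypothetical and not part of the project materials.

```python
from typing import Iterable, List, Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (assumed; other costs are possible)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # match or replace
        prev = curr
    return prev[-1]


def best_matches(token: str, dictionary: Iterable[str]) -> List[str]:
    """Return every dictionary entry tied for the smallest distance to token."""
    scored = [(edit_distance(token, word), word) for word in dictionary]
    if not scored:
        return []
    best = min(dist for dist, _ in scored)
    return sorted(word for dist, word in scored if dist == best)


def evaluate(pairs: Sequence[Tuple[str, str]],
             dictionary: Iterable[str]) -> Tuple[float, float]:
    """Score (token, intended word) pairs.

    precision = correct matches / all matches returned
    recall    = tokens whose intended word was returned / all tokens
    """
    words = list(dictionary)
    returned = 0
    correct = 0
    for token, intended in pairs:
        matches = best_matches(token, words)
        returned += len(matches)
        if intended in matches:
            correct += 1
    precision = correct / returned if returned else 0.0
    recall = correct / len(pairs) if pairs else 0.0
    return precision, recall


# Toy usage with a made-up four-word dictionary.
toy_dictionary = ["ten", "tea", "hello", "help"]
toy_pairs = [("teh", "ten"), ("helo", "hello")]
precision, recall = evaluate(toy_pairs, toy_dictionary)
```

Returning all tied candidates rather than a single word makes the trade-off between precision and recall visible: ties raise the chance that the intended word is among the matches, but every extra candidate lowers precision. Comparing each token against the whole dictionary is slow for a realistic word list, so a practical system would likely need some form of candidate filtering.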

Terms of Use

By using this data, you are becoming part of the research community - consequently, as part of your commitment to Academic Honesty, you must cite the curators of the dataset in your report, as the following publication:

Bo Han and Timothy Baldwin (2011) Lexical normalisation of short text messages: Makn sens a #twitter. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Portland, USA. pp. 368-378.

Reports that do not cite this work constitute plagiarism, and will be correspondingly assigned a mark of 0.

Please note that the dataset is a sub-sample of actual data posted to Twitter, with almost no filtering whatsoever. Unfortunately, the Internet is a place where freedom of speech is both empowering and harmful: consequently, some of the information expressed in the tweets is undoubtedly in poor taste. We would ask you to please look beyond this to the task at hand, as much as possible. (For example, it is generally not necessary to actually read the tweets themselves.)



The opinions expressed within the tweets in no way express the official views of the University of Melbourne or any of its employees; using the data in a teaching capacity does not constitute endorsement of the views expressed within. The University accepts no responsibility for offence caused by any content contained within this data.

If you object to these Terms, please contact us as soon as possible.

Assessment Criteria

Method: (20% of the marks available) You will attempt a representative sample of approximate matching techniques, which is adequate for deriving some knowledge about the problem of tweet normalisation. You will evaluate your method(s) formally.

Critical Analysis: (50% of the marks available) You will explain the practical behaviour of your systems, referring to the theoretical behaviour where appropriate. You will support your observations with evidence, in terms of illustrative examples and evaluation metrics. You will derive some knowledge about the problem of tweet normalisation.

Report Quality: (30% of the marks available) You will produce a formal report, which is commensurate in style and structure with a (short) research paper. You must express your ideas clearly and concisely, and remain within the word limit (1000-1600 words). You will include a short summary of related research.

We will post a marking rubric to indicate what we will be looking for in each of these categories when marking.

Changes/Updates to the Project Specifications

If we require any (hopefully small-scale) changes or clarifications to the project specifications, they will be posted on the LMS. Any addendums will supersede information included in this document.

Academic Misconduct

For most people, collaboration will form a natural part of the undertaking of this project. However, it is still an individual task, and so reuse of ideas or excessive influence in algorithm choice and development will be considered cheating. We will be checking submissions for originality and will invoke the University’s Academic Misconduct policy (http://academichonesty.unimelb.edu.au/policy.html) where inappropriate levels of collusion or plagiarism are deemed to have taken place.
