Implementing autocomplete with a trie

Reference no: EM131269818

Introduction to Computer Science Assignment

Question 1: Implementing Autocomplete with a Trie   

In this question, you will work on the problem of storing a set of n words (also called keys) in a tree data structure, together with a method for efficiently finding all the words that begin with a given prefix. By word or key here, we just mean a string formed from some set of ASCII characters. (If you are unfamiliar with ASCII, see https://www.asciitable.com/. ASCII text is typically stored using one byte, or 8 bits, per character.)

You should be familiar with the above problem through the autocomplete feature found on some cell phones, in web forms, and in IDEs such as Eclipse. For instance, if you type in "aar", then autocomplete may suggest aardvark, aardvarks, aardwolf, aardwolves, aargh.

Trie Data Structure

A trie is a type of rooted tree used to efficiently index keys by storing their prefixes. (The term trie comes from the word "retrieval". It is sometimes pronounced "try", although many people pronounce it the same as "tree".) The main property of a trie is that each edge corresponds to a character. Thus the path from the root of the trie to any node in the trie defines a string. The string defined by each node is a prefix of all strings defined by the descendants of that node.

Below is an example trie. It contains the following keys: a, and, ax, dog, door, dot.  You will note that the dashed nodes correspond to prefixes in the trie that are not in our list of keys (an, d, do, doo). The trie must also keep track of this distinction by storing for each node whether it corresponds to a key or not.

Note that if C is the set of all possible characters defined at the edges, then each node can have at most  k = | C | children, where the | C |  notation just means the number of elements in set C.

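[Figure: example trie containing the keys a, and, ax, dog, door, dot; dashed nodes mark prefixes that are not keys.]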
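The node layout described above can be sketched as follows. This is only a minimal illustration, not the provided Trie.java: the names SketchTrie, Node, isKey and NUM_CHILDREN, and the value 128 (7-bit ASCII), are assumptions made for the example.

```java
// Illustrative sketch only; all names here are hypothetical and are not
// taken from the provided Trie.java.
class SketchTrie {
    // Assumed alphabet size k = |C|: the 128 ASCII code points.
    static final int NUM_CHILDREN = 128;

    static class Node {
        Node[] children = new Node[NUM_CHILDREN]; // one slot per possible character
        boolean isKey;                            // true if the path to this node spells a stored key
    }

    final Node root = new Node();

    // Walk down from the root, creating missing nodes, then mark the last node as a key.
    // Assumes every character of key is ASCII (code point below 128).
    void insert(String key) {
        Node cur = root;
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (cur.children[c] == null) {
                cur.children[c] = new Node();
            }
            cur = cur.children[c];
        }
        cur.isKey = true;
    }

    // A string is contained only if its path exists and ends at a node marked as a key.
    boolean contains(String key) {
        Node cur = root;
        for (int i = 0; i < key.length(); i++) {
            cur = cur.children[key.charAt(i)];
            if (cur == null) {
                return false;
            }
        }
        return cur.isKey;
    }

    public static void main(String[] args) {
        SketchTrie t = new SketchTrie();
        for (String k : new String[] {"a", "and", "ax", "dog", "door", "dot"}) {
            t.insert(k);
        }
        System.out.println(t.contains("and")); // prints true
        System.out.println(t.contains("an"));  // prints false: "an" is only a prefix
    }
}
```

With the six example keys, "an", "d", "do" and "doo" all have nodes in the trie, but their isKey flag stays false, which is the distinction the dashed nodes in the figure represent.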

What You Need To Do

Read the provided code (Trie.java, AutoComplete.java), including the comments which explain what the various methods do. This will take you some time, so go slowly and read the code carefully. Note that the Trie class has a private inner class TrieNode.

For the Trie class, fill in the missing code for the following methods:

  • createChild(), toString() for the inner class TrieNode
  • getPrefixNode(), insert(), contains(), getAllPrefixMatches()

We suggest you implement them in the order listed above.  You may write helper methods, but if you do then add a comment explaining what you are doing so that the grader can easily follow.  

You should use Java String methods length() and charAt(). But you may not use String searching methods such as String.startsWith(), String.substring(), etc. The point of the assignment is for you to organize and compare strings using a trie data structure.
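As a rough illustration of how prefix matching can be done with only length() and charAt(), the sketch below first walks to the node for the prefix and then collects every key beneath it. The class and method names (PrefixSketch, findPrefixNode, allMatches, collect) are hypothetical and do not reflect the signatures in the provided Trie.java; the use of StringBuilder and an alphabet of 128 characters are also assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and signatures are hypothetical.
class PrefixSketch {
    static final int NUM_CHILDREN = 128; // assumed: ASCII input only

    static class Node {
        Node[] children = new Node[NUM_CHILDREN];
        boolean isKey;
    }

    final Node root = new Node();

    void insert(String key) {
        Node cur = root;
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (cur.children[c] == null) {
                cur.children[c] = new Node();
            }
            cur = cur.children[c];
        }
        cur.isKey = true;
    }

    // Follow the prefix one character at a time; null means no key has this prefix.
    Node findPrefixNode(String prefix) {
        Node cur = root;
        for (int i = 0; i < prefix.length() && cur != null; i++) {
            cur = cur.children[prefix.charAt(i)];
        }
        return cur;
    }

    // Collect every stored key that begins with the given prefix.
    List<String> allMatches(String prefix) {
        List<String> out = new ArrayList<>();
        Node start = findPrefixNode(prefix);
        if (start != null) {
            collect(start, new StringBuilder(prefix), out);
        }
        return out;
    }

    // Pre-order traversal; children are visited in increasing character order.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.isKey) {
            out.add(path.toString());
        }
        for (int c = 0; c < NUM_CHILDREN; c++) {
            if (node.children[c] != null) {
                path.append((char) c);
                collect(node.children[c], path, out);
                path.setLength(path.length() - 1); // undo the append before trying the next child
            }
        }
    }

    public static void main(String[] args) {
        PrefixSketch t = new PrefixSketch();
        for (String k : new String[] {"aardwolf", "aargh", "aardvark", "apple", "aardvarks", "aardwolves"}) {
            t.insert(k);
        }
        System.out.println(t.allMatches("aar"));
        // prints [aardvark, aardvarks, aardwolf, aardwolves, aargh]
    }
}
```

No string is ever searched or sliced here: the prefix is consumed by index, and the matching words are rebuilt from the edge characters during the traversal.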

Submit only the file Trie.java as a zip file.  Use the AutoComplete.java file to test your code, but do not submit this tester file.

Question 2: Running time analysis of Tries

Here you will analyze the complexity of two methods, namely loadKeys() and contains().

Suppose you are given a set of K keys, where each key is of length at most L characters. Let the keys be represented by a trie T1 as implemented in Question 1.  Let C be the maximum number of children of a node.  Although C is a constant NUMCHILDREN in the code, we are asking you to treat it as a variable in these questions.

Answer the following questions. In each case, give the tightest bound. For example, if the bound was O(L) and you wrote O(KL), then your answer would not be the tightest bound. In addition, you must provide a short written justification for your answer.

In each case, your answer should be in terms of the variables K, L, C.

1) What is the O( ) bound of loadKeys() on T1?

2) What is the O( ) bound of contains() on T1?

In the TrieNode inner class, the children of a node are implemented using an array of size C. In many tree data structures, however, the set of children is implemented using a linked list. Suppose we were to modify the TrieNode inner class so that each node's children were implemented using a linked list. Let T2 denote the modified trie data structure.
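One possible shape for such a modified node is sketched below, using a first-child / next-sibling layout. This is an assumption about how T2 might look, not a description of any provided code; every name in it (LinkedChildrenSketch, firstChild, nextSibling, findChild, getOrCreateChild) is hypothetical.

```java
// Illustrative sketch of a trie whose children are kept in a linked list.
// All names are hypothetical.
class LinkedChildrenSketch {
    static class Node {
        final char label;  // character on the edge from the parent
        boolean isKey;
        Node firstChild;   // head of this node's child list
        Node nextSibling;  // next node in the parent's child list

        Node(char label) {
            this.label = label;
        }
    }

    // Finding a child is a scan of the list rather than an array index,
    // so it may visit every sibling before succeeding or giving up.
    static Node findChild(Node parent, char c) {
        for (Node n = parent.firstChild; n != null; n = n.nextSibling) {
            if (n.label == c) {
                return n;
            }
        }
        return null;
    }

    static Node getOrCreateChild(Node parent, char c) {
        Node found = findChild(parent, c);
        if (found != null) {
            return found;
        }
        Node created = new Node(c);
        created.nextSibling = parent.firstChild; // prepend to the child list
        parent.firstChild = created;
        return created;
    }

    static void insert(Node root, String key) {
        Node cur = root;
        for (int i = 0; i < key.length(); i++) {
            cur = getOrCreateChild(cur, key.charAt(i));
        }
        cur.isKey = true;
    }

    static boolean contains(Node root, String key) {
        Node cur = root;
        for (int i = 0; i < key.length() && cur != null; i++) {
            cur = findChild(cur, key.charAt(i));
        }
        return cur != null && cur.isKey;
    }

    public static void main(String[] args) {
        Node root = new Node('\0'); // the root's label is unused
        for (String k : new String[] {"a", "and", "ax", "dog", "door", "dot"}) {
            insert(root, k);
        }
        System.out.println(contains(root, "dot")); // prints true
        System.out.println(contains(root, "do"));  // prints false
    }
}
```

Compared with the array version, each node stores only the children that actually exist, at the price of a scan in findChild.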

3) What is the O( ) bound of loadKeys() on T2?

4) What is the O( ) bound of contains() on T2?

We next compare the runtime of tries to the runtimes of binary search trees (BST). Consider a BST data structure that stores the same set of K keys. Each node in this BST would correspond to a key, such that the left (or right) child of each node is lexicographically smaller (or bigger) than its parent.  Note there is no prefix representation with a BST.

Note that the running time of operations on a BST depends on how "balanced" the tree is. (Roughly speaking, a tree is well balanced if the number of descendants of each left child is approximately the same as the number of descendants of each right child.)
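To make the effect of balance concrete, here is a small sketch of an unbalanced (non-self-balancing) BST of strings. It inserts the six example keys in two different orders and reports the resulting height. The class BstSketch and its methods are hypothetical, and String.compareTo() is used here only because this sketch is not part of the Trie class.

```java
// Illustrative sketch only; a plain BST with no rebalancing.
class BstSketch {
    static class Node {
        final String key;
        Node left, right;

        Node(String key) {
            this.key = key;
        }
    }

    // Standard BST insertion ordered lexicographically; duplicates are ignored.
    static Node insert(Node root, String key) {
        if (root == null) {
            return new Node(key);
        }
        int cmp = key.compareTo(root.key);
        if (cmp < 0) {
            root.left = insert(root.left, key);
        } else if (cmp > 0) {
            root.right = insert(root.right, key);
        }
        return root;
    }

    // Height measured as the number of nodes on the longest root-to-leaf path.
    static int height(Node root) {
        if (root == null) {
            return 0;
        }
        return 1 + Math.max(height(root.left), height(root.right));
    }

    static Node build(String[] keys) {
        Node root = null;
        for (String k : keys) {
            root = insert(root, k);
        }
        return root;
    }

    public static void main(String[] args) {
        String[] sorted = {"a", "and", "ax", "dog", "door", "dot"};
        String[] mixed = {"dog", "and", "door", "a", "ax", "dot"};
        System.out.println(height(build(sorted))); // prints 6: every node has only a right child
        System.out.println(height(build(mixed)));  // prints 3
    }
}
```

The same six keys give a chain of height 6 when inserted in sorted order and a tree of height 3 when inserted in the second order, which is why the best and worst cases are asked about separately.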

5) What is the O( ) bound of loadKeys() for a BST?

6) What is the Ω( ) bound of loadKeys() for a BST?

7) What is the O( ) bound of contains() for a BST?

8) What is the Ω( ) bound of contains() for a BST?

Assume that these methods perform the same role as they do for tries, i.e., loadKeys() populates a BST, and contains() searches for a key in a BST.

Hint: Comparing the lexicographic order of two strings of length L requires at least one and at most L comparisons.
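The hint can be checked with a short sketch that counts character comparisons while deciding the order of two strings. The class and method names are hypothetical, and the sketch assumes both strings are non-empty.

```java
// Illustrative sketch only; names are hypothetical.
class CompareSketch {
    // Returns how many character positions are examined before the
    // lexicographic order of a and b is decided (or one string runs out).
    static int comparisonsNeeded(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int count = 0;
        for (int i = 0; i < n; i++) {
            count++;
            if (a.charAt(i) != b.charAt(i)) {
                return count; // first differing position settles the order
            }
        }
        return count; // all shared positions matched
    }

    public static void main(String[] args) {
        System.out.println(comparisonsNeeded("dog", "axe"));   // prints 1: differs at the first character
        System.out.println(comparisonsNeeded("door", "doom")); // prints 4: differs only at the last character
        System.out.println(comparisonsNeeded("dot", "dot"));   // prints 3: equal strings need all L comparisons
    }
}
```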

What You Need To Do

Submit a text file A3Q2.txt with answers to the questions. Feel free to write Omega( ) instead of Ω( ). Include this text file in your A3.zip folder which you submit.

