What punctuation can be removed to determine terms

Reference no: EM131422208

Assignment

1. Consider this dictionary: {"CAT", "COUNT", "DOG", "DONKEY", "ELEPHANT" }

Term-ID1    Offset
1           ____
2           ____
3           ____
4           ____
5           ____

a) Complete this table assuming "dictionary as a string"

b) Create a second dictionary consisting of each word reversed (e.g. CAT -> TAC ). Show the dictionary as a string.

Term-ID2    Offset
1           ____
2           ____
3           ____
4           ____
5           ____

c) Complete this table using your reversed dictionary string

d) Using your two dictionaries, show how you can determine the words that satisfy the wildcard query C*T
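Question 1 can be explored with a small Python sketch. It assumes that "dictionary as a string" means concatenating the sorted terms and recording each term's zero-based start position as its offset; the assignment's own table may count from 1 instead. Sorting the reversed terms is also an assumption, and the helper names pack, unpack and wildcard are hypothetical.

```python
def pack(terms):
    """Concatenate terms into one string and record each term's start offset."""
    offsets, pos = [], 0
    for term in terms:
        offsets.append(pos)
        pos += len(term)
    return "".join(terms), offsets


def unpack(string, offsets):
    """Recover the terms: each one ends where the next one starts."""
    ends = offsets[1:] + [len(string)]
    return [string[a:b] for a, b in zip(offsets, ends)]


terms = sorted(["CAT", "COUNT", "DOG", "DONKEY", "ELEPHANT"])
fwd_string, fwd_offsets = pack(terms)

# Reversed dictionary: reverse every term, then sort (assumed ordering).
rev_terms = sorted(t[::-1] for t in terms)
rev_string, rev_offsets = pack(rev_terms)


def wildcard(prefix, suffix):
    """Answer prefix*suffix by intersecting two prefix lookups."""
    # Forward dictionary: terms that start with the prefix.
    starts = {t for t in unpack(fwd_string, fwd_offsets) if t.startswith(prefix)}
    # Reversed dictionary: reversed terms that start with the reversed suffix.
    ends = {t[::-1] for t in unpack(rev_string, rev_offsets)
            if t.startswith(suffix[::-1])}
    # The length check stops prefix and suffix from overlapping in short terms.
    return sorted(t for t in starts & ends if len(t) >= len(prefix) + len(suffix))


print(fwd_string, fwd_offsets)
print(rev_string, rev_offsets)
print(wildcard("C", "T"))
```

The query C*T becomes a prefix search for C in the forward dictionary and a prefix search for T in the reversed dictionary; the words in both result sets satisfy the wildcard.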

Consider the following documents:

Doc1: the wood table

Doc2: they made the wood

Doc3: the table is made of steel

Doc4: wood table or steel table

Using a shingle size of 2, compute the Jaccard coefficient of:

(Doc1, Doc2)

(Doc1, Doc3)

(Doc1, Doc4)

Based upon your results, Doc1 is most similar to ____?
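The Jaccard computation can be checked with a short Python sketch. It assumes the shingles are word-level (each shingle is two consecutive words) rather than character-level, and that the Jaccard coefficient is the size of the intersection divided by the size of the union of the two shingle sets. The function names shingles and jaccard are hypothetical.

```python
def shingles(text, k=2):
    """Return the set of k-word shingles in a text (word-level, lower-cased)."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


docs = {
    "Doc1": "the wood table",
    "Doc2": "they made the wood",
    "Doc3": "the table is made of steel",
    "Doc4": "wood table or steel table",
}

base = shingles(docs["Doc1"])
for name in ("Doc2", "Doc3", "Doc4"):
    print(name, jaccard(base, shingles(docs[name])))
```

Under these assumptions, the document with the largest printed coefficient is the one most similar to Doc1.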

2. Using this robots.txt file:

1. Crawl-delay: 10
2. User-agent: crawlerbot
3. Disallow: /includes
4. Disallow: /misc
5. Disallow: /setup
6. Allow: /misc/*.jpg
7. User-agent: *
8. Disallow: /setup

a) What does line 1 mean?

b) What is the difference between lines 2 and 7?

c) Should any crawler access the file /setup/help.txt?

d) Should the crawler "mybot" access the file /a/b.htm?
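The access questions can be tested with Python's standard urllib.robotparser module. This is a minimal sketch: the list numbers are stripped from the robots.txt file before parsing, and the helper name allowed is hypothetical. It does not rely on the Crawl-delay line or on the wildcard in line 6, because parsers differ in how they treat a directive placed before any User-agent line and in whether they expand * inside a path.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt file from the question, without the list numbers.
ROBOTS_TXT = """\
Crawl-delay: 10
User-agent: crawlerbot
Disallow: /includes
Disallow: /misc
Disallow: /setup
Allow: /misc/*.jpg
User-agent: *
Disallow: /setup
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """True if the parsed rules permit this user agent to fetch the path."""
    return parser.can_fetch(agent, path)


for agent, path in [("crawlerbot", "/setup/help.txt"),
                    ("mybot", "/setup/help.txt"),
                    ("mybot", "/a/b.htm")]:
    print(agent, path, allowed(agent, path))
```

A crawler named in its own User-agent group follows that group; any other crawler, such as "mybot", falls back to the User-agent: * group.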

3. Consider the following text:

This tree is just one of many older-growth trees in the forest. Forests in Texas, can be over 100 years-old before they are considered "old". Trees can be over 200 years.

a) What punctuation can be removed to determine terms?

b) What stop words can be removed?

c) Which tokens can be converted to lower case?
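The three steps in question 3 can be sketched in Python. The choices here are assumptions, not the only valid answer: punctuation is dropped except for hyphens inside a word (so "older-growth" stays one token), every token is lower-cased (including the proper noun "Texas"), and STOP_WORDS is a small hypothetical stop list chosen for this text.

```python
import re

TEXT = ('This tree is just one of many older-growth trees in the forest. '
        'Forests in Texas, can be over 100 years-old before they are '
        'considered "old". Trees can be over 200 years.')

# Hypothetical stop list; real systems use longer, curated lists.
STOP_WORDS = {"this", "is", "just", "one", "of", "in", "the", "can", "be",
              "over", "before", "they", "are"}


def tokenize(text):
    """Split into tokens, dropping punctuation but keeping word-internal hyphens."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)


def normalize(tokens):
    """Lower-case every token and drop those found in the stop list."""
    lowered = (t.lower() for t in tokens)
    return [t for t in lowered if t not in STOP_WORDS]


tokens = tokenize(TEXT)
terms = normalize(tokens)
print(tokens)
print(terms)
```

Removing the hyphens as well would split "older-growth" and "years-old" into separate terms; which behaviour is preferable depends on the queries the index has to serve.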

