Program for searching by indexing text files, Programming Languages

Assignment Help:

Write a program that facilitates searching by indexing a text file according to its words. In this task, you are given a large text file, sample.txt, whose words you will need to index.

To do this, you will separate out the words in the text file and index them according to their frequencies. Your program shall count the unique words, together with how often each occurs, and store them in an appropriate Standard Template Library container. The words are to be normalized to lower case so that case sensitivity does not have to be dealt with. Your program shall ignore the following:

  • Punctuation
  • Numerals (1, 2, etc.; 'one', 'two' are to be treated as words)
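The word-separation and counting step could be sketched as follows. This is only one possible reading of the requirements, not a model answer: the helper name count_words is hypothetical, and it assumes that a "word" is a maximal run of ASCII letters, so digits and punctuation act as separators (an apostrophe would therefore split "don't" into "don" and "t"). A std::map is used because it keeps the words in ascending order.

```cpp
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: reads all text from `in` and returns each unique
// lower-cased word with the number of times it occurs.
// Assumption: only ASCII letters form words; every other character
// (punctuation, digits, whitespace) ends the current word and is dropped.
std::map<std::string, int> count_words(std::istream& in) {
    std::map<std::string, int> counts;
    std::string word;
    char c;
    while (in.get(c)) {
        unsigned char uc = static_cast<unsigned char>(c);
        if (std::isalpha(uc)) {
            // Normalize to lower case while building the word.
            word += static_cast<char>(std::tolower(uc));
        } else if (!word.empty()) {
            ++counts[word];
            word.clear();
        }
    }
    if (!word.empty()) {
        ++counts[word];  // the last word may not be followed by a separator
    }
    return counts;
}
```

The function takes any std::istream, so it can be called with a std::ifstream opened on sample.txt or with a std::istringstream for testing.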

Next, your program shall generate two output files, index.txt and common.txt. At the start of the program, you shall prompt the user to enter a threshold number. This number determines whether each unique word is stored in index.txt or common.txt.

Unique words with a frequency greater than or equal to the threshold are to be stored in common.txt. Likewise, unique words with a frequency less than the threshold are to be stored in index.txt.
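The threshold rule above can be illustrated with a short sketch. The names WordCounts and split_by_threshold are hypothetical and chosen only for this example; the sketch assumes the word counts are already held in a std::map.

```cpp
#include <map>
#include <string>
#include <utility>

using WordCounts = std::map<std::string, int>;

// Hypothetical helper: splits the counted words into two maps.
// .first  holds words with frequency <  threshold (destined for index.txt)
// .second holds words with frequency >= threshold (destined for common.txt)
std::pair<WordCounts, WordCounts> split_by_threshold(const WordCounts& counts,
                                                     int threshold) {
    WordCounts index;
    WordCounts common;
    for (const auto& entry : counts) {
        if (entry.second >= threshold) {
            common.insert(entry);
        } else {
            index.insert(entry);
        }
    }
    return {index, common};
}
```

Because both results are again std::map objects, the words in each remain sorted in ascending order without a separate sorting step.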

As an illustration, suppose a text file, sample.txt, contains the following:

Give us a break!  It is a beautiful day.  We do not want to do programming today.  Do you want to go to the beach with us?

When the program starts:

Enter threshold number: 2

The above indicates that the user enters 2 as the threshold number. Your program shall generate the two output files with the following content (words sorted in ascending order):

index.txt

Total words: 15
beach           1
beautiful       1
break           1
day             1
give            1
go              1
is              1
it              1
not             1
programming     1
the             1
today           1
we              1
with            1
you             1

common.txt

Total words: 5
a               2
do              3
to              3
us              2
want            2
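Output in the layout illustrated above could be written along these lines. The helper name write_report and the column width of 16 are assumptions made for this sketch; the assignment does not specify the exact spacing between a word and its count.

```cpp
#include <iomanip>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes a "Total words" header followed by one
// "word  count" line per entry. A std::map iterates in ascending key
// order, so the words come out sorted.
// Assumption: words are left-aligned in a 16-character column.
void write_report(std::ostream& out, const std::map<std::string, int>& words) {
    out << "Total words: " << words.size() << '\n';
    for (const auto& entry : words) {
        out << std::left << std::setw(16) << entry.first << entry.second << '\n';
    }
}
```

Since the function accepts any std::ostream, the same code can write to a std::ofstream opened on index.txt or common.txt, or to std::cout while debugging.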

