Develop a simple anti-virus that examines unknown binaries

Reference no: EM131940328

Foundations of Cybersecurity Project: Anti-virus

Description and Deliverables - In this project, you will gain hands-on experience with a core technique in defensive cybersecurity: signature matching. You will develop a simple anti-virus that (1) creates signatures that match known malware, and then (2) examines unknown binaries to determine whether they contain a malware signature. You will be provided with malware and benign binaries to help train your anti-virus.

To receive full credit for this project, you will turn in (at least) three things:

1. A program named av-train that analyzes some given binaries and produces signatures of malware.

2. A program named av-detect that analyzes some given binaries and determines whether each one matches a malware signature.

3. A Makefile that compiles your two programs (or is empty and does nothing, if you're using a language that doesn't require compilation).

Goals and Datasets - In this assignment, your goal is to develop a complete anti-virus system that maximizes true positives (malware detections) and true negatives (not detecting benign binaries), while also minimizing false negatives (malware that is missed) and false positives (benign binaries that are mistaken for malware). You will develop two programs, av-train and av-detect: the former creates signatures from known binaries, and the latter uses those signatures to classify unknown binaries.

To achieve these goals, we have produced four datasets:

  • safe_pub.tar.gz: 3673 benign binaries (true negatives). Your anti-virus should never detect one of these binaries as malware (false positive).
  • malware_pub.tar.gz: 1360 malware binaries. Your anti-virus should create signatures from these binaries. It should also be able to detect all of them as malware (true positives) and miss none of them (false negatives).
  • safe_priv.tar.gz: An unknown number of benign binaries that we will use to evaluate your anti-virus.
  • malware_priv.tar.gz: An unknown number of malware binaries that we will use to evaluate your anti-virus.

In other words, you will use the two public datasets to develop, debug, and test your anti-virus system. In turn, we will evaluate and grade your system based on the two private datasets.
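The four outcomes named in the goals above can be tallied once the anti-virus has labelled the public datasets. The following is a minimal sketch in Python; the function names `confusion_counts` and `rates` are illustrative and not part of the assignment.

```python
def confusion_counts(predictions, truths):
    """Tally TP/TN/FP/FN from parallel lists of "MALWARE"/"SAFE" labels."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for predicted, actual in zip(predictions, truths):
        if actual == "MALWARE":
            # Known malware: detected (TP) or missed (FN).
            counts["TP" if predicted == "MALWARE" else "FN"] += 1
        else:
            # Known benign: wrongly flagged (FP) or correctly passed (TN).
            counts["FP" if predicted == "MALWARE" else "TN"] += 1
    return counts


def rates(counts):
    """Return the detection rate on malware and the false-alarm rate on benign files."""
    malware = counts["TP"] + counts["FN"]
    benign = counts["TN"] + counts["FP"]
    return {
        "true_positive_rate": counts["TP"] / malware if malware else 0.0,
        "false_positive_rate": counts["FP"] / benign if benign else 0.0,
    }
```

On the public data, the ideal result is a true positive rate of 1.0 over malware_pub and a false positive rate of 0.0 over safe_pub.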

av-train -

The first program you will develop is av-train. This program takes three parameters as input: (1) a directory containing malware binaries, (2) a directory containing benign binaries, and (3) the name of a file that will contain the set of malware signatures that you derive from the given directory of malware. Obviously, your goal is to produce signatures that maximize true positives and true negatives, while minimizing false positives and false negatives.
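One possible shape for av-train is sketched below in Python. It is not the required method: it assumes that a signature is a fixed-length byte string (8 bytes here) that occurs in a malware binary and in no benign binary, and that the signature file holds one hex-encoded signature per line. The assignment does not specify either choice, and the names `ngrams`, `read_files`, and `train` are hypothetical.

```python
import os

NGRAM = 8  # assumed signature length in bytes; the assignment does not fix one


def ngrams(data, n=NGRAM):
    """Return the set of all n-byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def read_files(directory):
    """Yield the raw bytes of each regular file in directory (read only, never executed)."""
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            with open(path, "rb") as handle:
                yield handle.read()


def train(malware_dir, benign_dir, signature_path, n=NGRAM):
    """Write one hex-encoded signature per line and return the signature set."""
    # Every byte sequence seen in a benign binary is off limits as a signature.
    benign = set()
    for data in read_files(benign_dir):
        benign |= ngrams(data, n)

    signatures = set()
    for data in read_files(malware_dir):
        candidates = ngrams(data, n) - benign
        # Add a signature only if this file is not already matched by one.
        if candidates and not (candidates & signatures):
            signatures.add(min(candidates))

    with open(signature_path, "w", newline="\n") as out:
        for signature in sorted(signatures):
            out.write(signature.hex() + "\n")
    return signatures
```

Holding every benign n-gram in memory is expensive for thousands of binaries, so a real submission would likely need a more economical representation. The sketch only illustrates the "in malware, not in benign" idea.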

av-detect -

The second program you will develop is av-detect. This program takes at least one, and possibly more, command line parameters:

$ ./av-detect <input signature file> [unknown binary 1] [unknown binary 2] ... [unknown binary n]

The first parameter is the signature file produced by your av-train program. All of the other parameters are unknown binaries: for each given unknown binary, your av-detect program should print to STDOUT (1) the name of the file and (2) whether it is "MALWARE" or "SAFE". Note that the first parameter (the signature file) is required; the list of unknown binaries is not required, and can be of any length.
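The command-line behavior described above could be sketched in Python as follows. This assumes the signature file contains one hex-encoded byte string per line and that each output line is the file name, a space, and the label. Both formats are assumptions, since the exact formats are not given here, and `load_signatures`, `classify`, and `main` are hypothetical names.

```python
import sys


def load_signatures(path):
    """Read one hex-encoded byte signature per line (an assumed file format)."""
    with open(path, "r") as handle:
        return [bytes.fromhex(line.strip()) for line in handle if line.strip()]


def classify(path, signatures):
    """Return "MALWARE" if any signature occurs in the file's bytes, else "SAFE"."""
    with open(path, "rb") as handle:  # the binary is only read, never executed
        data = handle.read()
    return "MALWARE" if any(sig in data for sig in signatures) else "SAFE"


def main(argv):
    """argv[1] is the signature file; argv[2:] are zero or more unknown binaries."""
    if len(argv) < 2:
        print("usage: av-detect <input signature file> [unknown binary ...]",
              file=sys.stderr)
        return 2
    signatures = load_signatures(argv[1])
    for path in argv[2:]:
        print(f"{path} {classify(path, signatures)}")
    return 0
```

To use the sketch as the av-detect entry point, the script would end with `sys.exit(main(sys.argv))`. With only a signature file and no binaries, it prints nothing and exits successfully, matching the note that the list of binaries may be empty.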


Reviews

len1940328

4/13/2018 3:16:38 AM

This project is due at 11:59pm on Tuesday 17. As part of this assignment you will download and decompress an archive that is full of live Linux malware. I repeat: THESE ARE LIVE MALICIOUS BINARIES. Under no circumstances should you execute these binaries, on any system. When you decompress the archive, your actual anti-virus program may flip out (because: malware!); you should tell your anti-virus to ignore the files, since you're smart and know not to run them.
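For quick inspection, the bytes of each sample can be read straight from the compressed archive without writing any binary to disk. The sketch below uses Python's standard `tarfile` module; the helper name `iter_archive_bytes` is hypothetical, and the snippet assumes the archive is a gzip-compressed tar file, as the .tar.gz names suggest. It only reads bytes and never executes anything.

```python
import tarfile


def iter_archive_bytes(archive_path):
    """Yield (member name, raw bytes) for each regular file in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            # Skip directories, links, and other non-regular entries.
            if member.isfile():
                with archive.extractfile(member) as handle:
                    yield member.name, handle.read()
```

For example, `for name, data in iter_archive_bytes("malware_pub.tar.gz"): ...` would walk the public malware set. Since av-train takes directories as input, the archives still have to be decompressed for the actual submission workflow.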

Submitting Your Project - Before turning in the project, you must register yourself for our grading system using the following command:

$ /course/cs2550/bin/register-student [NUID]

NUID is your Northeastern ID number, including any leading zeroes. This command is available on all of the CCIS lab machines. The exact files that you submit for this assignment will vary depending on the programming language you choose to use. At a minimum, you will probably submit:

  • A Makefile, which may be empty
  • The source code for your av-train program
  • The source code for your av-detect program

Grading - This project is worth 12% of your final grade, broken down as follows (out of 100):

  • 20 points: turning in programs that correctly compile (if necessary), execute without error, support the specified command line syntax, and produce output in the specified format.
  • 40 points: successfully identifying all malware in the private dataset as "MALWARE".
  • 40 points: successfully identifying all benign binaries in the private dataset as "SAFE".

Points can be lost for turning in files in incorrect formats (e.g. not UNIX-line break ASCII), or failing to follow specified formatting and naming conventions.
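The file-format requirement can be checked before submitting. The sketch below assumes the rule applies to text files such as source code and the Makefile, and it treats "UNIX-line break ASCII" as meaning "only ASCII bytes and no carriage returns". The helper name `is_unix_ascii` is hypothetical.

```python
def is_unix_ascii(path):
    """Return True if the file holds only ASCII bytes and no carriage returns."""
    with open(path, "rb") as handle:
        data = handle.read()
    # bytes.isascii() requires Python 3.7 or later.
    return data.isascii() and b"\r" not in data
```

A file saved with Windows line endings (CRLF) or containing non-ASCII characters fails this check.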

Notice the split between training and testing malware: the testing set contains malware that is similar, but not identical, to the malware in the training set. Your av-train program will not be given access to the testing malware. The signatures you calculate for the training set must be good enough to detect malware in both the training and testing sets.
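Because the testing malware is never visible during development, one way to estimate how well signatures generalize is to hold back part of the public malware and train only on the rest. The sketch below is one such split in Python; the function name `split_holdout`, the 20% fraction, and the fixed seed are all assumptions, and a random hold-out only approximates the "similar, but not identical" relationship described above.

```python
import random


def split_holdout(paths, holdout_fraction=0.2, seed=0):
    """Split file paths into (training, held_out) lists, reproducibly."""
    shuffled = sorted(paths)               # fixed starting order
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    cut = int(len(shuffled) * holdout_fraction)
    return shuffled[cut:], shuffled[:cut]
```

Signatures would then be derived from the training list only, and the held-out list is classified to see how many unseen samples are still detected.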

Bonus Points - This assignment contains a competitive element for bonus points. Specifically, the top k students whose av-train programs produce the smallest signature files will receive bonus points. k will be determined once all submissions are graded. The most concise signature files will receive 2% bonuses (on top of the 12% project grade); runners-up will receive 1% bonuses.
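One way to shrink a signature file is to keep only as many signatures as are needed to match every training sample. The sketch below applies the standard greedy set-cover heuristic in Python. It assumes a precomputed mapping from each candidate signature to the set of malware files it matches; the name `greedy_cover` is hypothetical. Greedy selection is an approximation and does not guarantee the smallest possible set.

```python
def greedy_cover(candidates):
    """Pick signatures until every coverable malware file is matched.

    candidates maps each signature (bytes) to the set of file names it matches.
    """
    uncovered = set().union(*candidates.values())
    chosen = []
    while uncovered:
        # Take the signature matching the most still-uncovered files;
        # sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates),
                   key=lambda sig: len(candidates[sig] & uncovered))
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen
```

For instance, if one candidate matches three files that two other candidates only cover between them, the greedy pass keeps the single broader signature. Fewer signatures can also generalize less well to unseen malware, so file size trades off against detection on the private set.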

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd