# CSE 525 | Computer Science homework help

The genome of an organism can be expressed as some number G of
“base pairs” (see http://en.wikipedia.org/wiki/Base_pair). Typical
sizes of various genomes are given at http://en.wikipedia.org/wiki/Genome.

String matching can be used to find particular sequences in a
genome. Several string matching algorithms are described
at http://en.wikipedia.org/wiki/String_matching.

Consider a program to find whether a particular sequence of
base pairs occurs in a genome and, if so, where and how
many times.
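As a serial baseline for the three questions above (is the sequence present, where, and how many times), a minimal sketch follows. It assumes the genome and the sequence both fit in memory as Python strings; the function name `find_all` is hypothetical, and the search relies on the built-in `str.find` rather than any particular algorithm from the Wikipedia page.

```python
def find_all(genome: str, pattern: str) -> list[int]:
    """Return the 0-based start index of every occurrence of pattern.

    Overlapping occurrences are included. Assumes both strings fit in memory.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    positions = []
    start = genome.find(pattern)
    while start != -1:
        positions.append(start)
        # Advance by one character, not len(pattern), so overlaps are counted.
        start = genome.find(pattern, start + 1)
    return positions


if __name__ == "__main__":
    hits = find_all("GATTACAGATTACA", "ATTA")
    # found?, where, how many times
    print(bool(hits), hits, len(hits))
```

The returned list answers all three questions at once: it is non-empty if the sequence is present, its entries are the positions, and its length is the count.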

Your program will run on a cluster with
the following properties:

Number of nodes               20
Processors per node           16, 2.6 GHz Xeon
Memory per node               16 GB
GPUs per node                 2 (NVIDIA CUDA), 1024 stream processors and 4 GB RAM, running at 1.5 GHz
Local drives                  1 TB SATA, 6 GB/sec
NFS drive                     10 TB RAID, bandwidth limited by network

Switched Ethernet network
Latency               L = 20 microseconds
Bandwidth             B = 1 Gb/sec (100 Mbytes/sec) for messages larger than 32 Kbytes
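One common way to use the latency and bandwidth figures is the linear cost model, time = L + n/B, for a message of n bytes. The sketch below applies that model with the values given above. The model itself is an assumption, not part of the problem statement, and since B is only quoted for messages larger than 32 Kbytes, estimates for smaller messages are likely optimistic. The function name `message_time` is hypothetical.

```python
LATENCY_S = 20e-6        # L = 20 microseconds, from the cluster description
BANDWIDTH_BPS = 100e6    # B = 100 Mbytes/sec, from the cluster description


def message_time(n_bytes: int) -> float:
    """Estimated seconds to send one n_bytes message (assumed model: L + n/B)."""
    if n_bytes < 0:
        raise ValueError("n_bytes must be non-negative")
    return LATENCY_S + n_bytes / BANDWIDTH_BPS


if __name__ == "__main__":
    for size in (32 * 1024, 10**6, 10**9):
        print(f"{size:>12d} bytes: {message_time(size):.6f} s")
```

Under this model latency dominates only for very small messages; for a file of a gigabyte or more, the n/B term is essentially the whole cost, which matters because the NFS drive is limited by the same network.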

You may not need all of the above information. If you feel you need some other system
property, feel free to assume a reasonable value (try Wikipedia).

Assume the genome you are exploring and the sequence you are
trying to find are both initially files on the NFS disk.

Deliverables:

1. Parallel string match algorithm in MPI, OpenMP, CUDA, or some
combination of these. A description in English and/or pseudocode is
sufficient. Is your algorithm data parallel, task parallel, or both?

Describe data transfer during computation (disk to program, process to process,
CPU – GPU and node – node). Describe how data is partitioned between processes,
shared between processes, or replicated at each process.
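One possible partitioning scheme, shown here only as an illustration and not as the expected answer, is a data-parallel split: the genome is cut into contiguous chunks, the short search sequence is replicated to every worker, and each chunk is extended by m - 1 characters (m being the pattern length) so that a match straddling a chunk boundary is still seen whole by exactly one worker. The sketch below runs the chunks sequentially in one process; in an MPI or OpenMP version each chunk would go to a rank or thread. The names `partition` and `search_partitioned` are hypothetical.

```python
def partition(n: int, m: int, parts: int) -> list[tuple[int, int]]:
    """Split [0, n) into up to `parts` ranges, each extended by m - 1 characters."""
    if parts < 1 or m < 1:
        raise ValueError("parts and m must be at least 1")
    base = -(-n // parts)  # ceiling division: characters owned by each chunk
    ranges = []
    for p in range(parts):
        lo = p * base
        if lo >= n:
            break
        # The m - 1 overlap lets a match that starts in this chunk finish in the next.
        hi = min(n, lo + base + m - 1)
        ranges.append((lo, hi))
    return ranges


def search_partitioned(genome: str, pattern: str, parts: int) -> list[int]:
    """Search each chunk independently and merge the global match positions."""
    hits = []
    for lo, hi in partition(len(genome), len(pattern), parts):
        chunk = genome[lo:hi]  # in a parallel version, this chunk goes to one worker
        pos = chunk.find(pattern)
        while pos != -1:
            hits.append(lo + pos)  # convert chunk-local index to genome index
            pos = chunk.find(pattern, pos + 1)
    return hits


if __name__ == "__main__":
    print(search_partitioned("ACGTACGTACGT", "GTAC", parts=3))
```

Because the overlap is exactly m - 1, a match can only be found by the chunk that owns its starting position, so the merged list needs no de-duplication. The only process-to-process traffic in this scheme is the initial distribution of chunks and the final gathering of match positions.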

2. You may not need all the hardware available for your algorithm. You may use the
entire cluster or any part of it. Describe what resources your algorithm will use to

3. Estimate how your algorithm would perform on the computer
system described above. Consider:
a. Complexity; communication costs.
b. Is there some file size (in bytes, number of elements, or both)
that is too small for your algorithm to work efficiently? Given the wide range of genome
sizes (see http://en.wikipedia.org/wiki/Genome), is there some range of size that you expect would
c. How much speedup would you expect on the given hardware as compared to running
