

Date: 2024-04-26  Source: hfw.cc  Author: hfw.cc



COM4521/COM6521 Parallel Computing with
Graphical Processing Units (GPUs)
Assignment (80% of module mark)
Deadline: 5pm Friday 17th May (Week 12)
Starting Code: Download Here
Document Changes
Any corrections or changes to this document will be noted here and an update
will be sent out via the course’s Google group mailing list.
Document Built On: 17 January 2024
Introduction
This assessment has been designed against the module’s learning objectives. The
assignment is worth 80% of the total module mark. The aim of the assignment is
to assess your ability and understanding of implementing and optimising parallel
algorithms using both OpenMP and CUDA.
An existing project containing a single-threaded implementation of three algorithms has been provided. This provided starting code also contains functions
for validating the correctness, and timing the performance of your implemented
algorithms.
You are expected to implement both an OpenMP and a CUDA version of each of
the provided algorithms, and to complete a report to document and justify the
techniques you have used, and demonstrate how profiling and/or benchmarking
supports your justification.
The Algorithms & Starting Code
Three algorithms have been selected which cover a variety of parallel patterns for
you to implement. As these are independent algorithms, they can be approached
in any order and their difficulty does vary. You may redesign the algorithms in
your own implementations for improved performance, providing input/output
pairs remain unchanged.
The reference implementation and starting code are available to download from:
https://codeload.github.com/RSE-Sheffield/COMCUDA_assignment_c614d9bf/zip/refs/heads/master
Each of the algorithms is described in more detail below.
Standard Deviation (Population)
Thrust/CUB may not be used for this stage of the assignment.
You are provided two parameters:
• An array of floating point values input.
• The length of the input array N.
You must calculate the standard deviation (population) of input and return a
floating point result.
The components of equation 1 are:
• σ: The population standard deviation
• Σ: The sum of...
• x_i: ...each value
• µ: The mean of the population
• N: The size of the population

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \tag{1}$$
The algorithm within cpu.c::cpu_standarddeviation() has several steps:
1. Calculate the mean of input.
2. Subtract mean from each element of input.
3. Square each of the resulting elements from the previous step.
4. Calculate the sum of the resulting array from the previous step.
5. Divide sum by N.
6. Return the square root of the previous step’s result.
It can be executed either via specifying a random seed and population size, e.g.:
<executable> CPU SD 12 100000
Or via specifying the path to a .csv input file, e.g.:
<executable> CPU SD sd_in.csv
Convolution
You are provided four parameters:
• A 1 dimensional input array input image.
• A 1 dimensional output array output image.
• The width of the image input.
• The height of the image input.
Figure 1: An example of a source image (left) and its gradient magnitude (right).
You must calculate the gradient magnitude of the greyscale image input. The
horizontal (Gx) and vertical (Gy) Sobel operators (equation 2) are applied to
each non-boundary pixel (P) and the magnitude calculated (equation 3) to
produce a gradient magnitude image to be stored in output. Figure 1 provides
an example of a source image and its resulting gradient magnitude.

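$$G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * P, \qquad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * P \tag{2}$$

$$G = \sqrt{G_x^2 + G_y^2} \tag{3}$$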
A convolution is performed by aligning the centre of the Sobel operator with a
pixel, and summing the result of multiplying each weight with its corresponding
pixel. The resulting value must then be clamped, to ensure it does not go out of
bounds.

The convolution operation is demonstrated in equation 4. A pixel with value
5 and its Moore neighbourhood are shown. This matrix is then componentwise multiplied (Hadamard product) by the horizontal Sobel operator and the
components of the resulting matrix are summed.
Pixels at the edge of the image do not have a full Moore neighbourhood, and
therefore cannot be processed. As such, the output image will be 2 pixels smaller
in each dimension.
The algorithm implemented within cpu.c::cpu_convolution() has four steps
performed per non-boundary pixel of the input image:
1. Calculate horizontal Sobel convolution of the pixel.
2. Calculate vertical Sobel convolution of the pixel.
3. Calculate the gradient magnitude from the two convolution results.
4. Approximately normalise the gradient magnitude and store it in the output
image.
It can be executed via specifying the path to an input .png image, optionally a
second output .png image can be specified, e.g.:
<executable> CPU C c_in.png c_out.png
Data Structure
You are provided four parameters:
• A sorted array of integer keys keys.
• The length of the input array len_k.
• A preallocated array for output boundaries.
• The length of the output array len_b.
You must calculate the index of the first occurrence of each integer within the
inclusive-exclusive range [0, len_b), and store it at the corresponding index in
the output array. Where an integer does not occur within the input array, it
should be assigned the index of the next integer which does occur in the array.
This algorithm constructs an index to data stored within the input array; this is
commonly used in data structures such as graphs and spatial binning. Typically
there would be one or more value arrays that have been pair sorted with the key
array (keys). The below code shows how values attached to the integer key 10
could be accessed.
for (unsigned int i = boundaries[10]; i < boundaries[11]; ++i) {
    float v = values[i];
    // Do something with v
}
The algorithm implemented within cpu.c::cpu_datastructure() has two
steps:
1. An intermediate array of length len_b must be allocated, and a histogram
of the values from keys calculated within it.
2. An exclusive prefix sum (scan) operation is performed across the previous
step’s histogram, creating the output array boundaries.
Figure 2 provides a visual example of this algorithm.
Index:       0 1 2 3 4 5 6
keys:        0 1 1 3 4 4 4
histogram:   1 2 0 1 3
boundaries:  0 1 3 3 4 7
Figure 2: An example showing how the input keys produces boundaries in the
provided algorithm.
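The two steps of the data structure algorithm (histogram, then exclusive prefix sum) can be sketched in serial C as follows. This is a minimal sketch rather than the provided cpu_datastructure(): the name build_boundaries is hypothetical, unsigned int keys are assumed, and the sketch assumes every key is smaller than len_b.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper following the two listed steps. Assumes every key
 * is smaller than len_b. */
void build_boundaries(const unsigned int *keys, size_t len_k,
                      unsigned int *boundaries, size_t len_b) {
    assert(keys != NULL && boundaries != NULL && len_b > 0);

    /* Step 1: histogram of key occurrences in a temporary array. */
    unsigned int *histogram = calloc(len_b, sizeof *histogram);
    assert(histogram != NULL);
    for (size_t i = 0; i < len_k; ++i) {
        assert(keys[i] < len_b);
        ++histogram[keys[i]];
    }

    /* Step 2: exclusive prefix sum (scan) of the histogram. */
    unsigned int running = 0;
    for (size_t b = 0; b < len_b; ++b) {
        boundaries[b] = running;
        running += histogram[b];
    }

    free(histogram); /* Avoid leaking the intermediate array. */
}
```

With the keys from Figure 2 (0 1 1 3 4 4 4) and len_b = 6, this produces boundaries 0 1 3 3 4 7. The absent key 2 receives index 3, the position of the next key that does occur, and the final entry equals len_k.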
It can be executed via specifying either a random seed and array length, e.g.:
<executable> CPU DS 12 100000
Or, via specifying the path to an input .csv, e.g.:
<executable> CPU DS ds_in.csv
Optionally, a .csv may also be specified for the output to be stored, e.g.:
<executable> CPU DS 12 100000 ds_out.csv
<executable> CPU DS ds_in.csv ds_out.csv
The Task
Code
For this assignment you must complete the code found in both openmp.c
and cuda.cu, so that they perform the same algorithm described above
and found in the reference implementation (cpu.c), using OpenMP and
CUDA respectively. You should not modify or create any other files within
the project. The three algorithms to be implemented are separated into 3
methods named openmp_standarddeviation(), openmp_convolution() and
openmp_datastructure() respectively (and likewise for CUDA).
You should implement the OpenMP and CUDA algorithms with the intention of
achieving the fastest performance for each algorithm on the hardware that you
use to develop and test your assignment.
It is important to free all used memory as memory leaks could cause the
benchmark mode, which repeats the algorithm, to run out of memory.
Report
You are expected to provide a report alongside your code submission. For each of
the 6 algorithms that you implement you should complete the template provided
in Appendix A. The report is your chance to demonstrate to the marker that
you understand what has been taught in the module.
Benchmarks should always be carried out in Release mode, with timing
averaged over several runs. The provided project code has a runtime argument
--bench which will repeat the algorithm for a given input 100 times (defined
in config.h). It is important to benchmark over a range of inputs, to allow
consideration of how the performance of each stage scales.
Deliverables
You must submit your openmp.c, cuda.cu and your report document
(e.g. .pdf/.docx) within a single zip file via Mole, before the deadline. Your
code should build in the Release mode configuration without errors or warnings
(other than those caused by IntelliSense) on Diamond machines. You do not
need to hand in any other project or code files other than openmp.c, cuda.cu.
As such, it is important that you do not modify any of the other files provided
in the starting code so that your submitted code remains compatible with the
projects that will be used to mark your submission.
Your code should not rely on any third party tools/libraries except for those
introduced within the lectures/lab classes. Hence, the use of Thrust and CUB is
permitted except for the standard deviation algorithm.
Even if you do not complete all aspects of the assignment, partial progress should
be submitted as this can still receive marks.
Marking
When marking, both the correctness of the output, and the quality/appropriateness of the technique used will be assessed. The report
should be used to demonstrate your understanding of the module’s theoretical
content by justifying the approaches taken and showing their impact on the
performance. The marks for each stage of the assignment will be distributed as
follows:
                 OpenMP (30%)    CUDA (70%)
Stage 1 (32%)        9.6%          22.4%
Stage 2 (34%)       10.2%          23.8%
Stage 3 (34%)       10.2%          23.8%
The CUDA stage is more heavily weighted as it is more difficult.
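As a check on the figures, each cell appears to be the product of the stage weighting and the platform weighting. This is an inference from the numbers rather than something the brief states explicitly; the Stage 1 weighting of 32% follows from 9.6% + 22.4%.

```latex
\begin{aligned}
\text{Stage 1:}\quad & 0.32 \times 0.30 = 0.096, \quad 0.32 \times 0.70 = 0.224 \\
\text{Stages 2 and 3:}\quad & 0.34 \times 0.30 = 0.102, \quad 0.34 \times 0.70 = 0.238
\end{aligned}
```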
For each of the 6 stages in total, the distribution of the marks will be determined
by the following criteria:
1. Quality of implementation
• Have all parts of the stage been implemented?
• Is the implementation free from race conditions or other errors regardless
of the output?
• Is code structured clearly and logically?
• How optimal is the solution that has been implemented? Has good hardware
utilisation been achieved?
2. Automated tests to check for correctness in a range of conditions
• Is the implementation for the specific stage complete and correct (i.e. when
compared to a number of test cases which will vary the input)?
3. Choice, justification and performance reporting of the approach towards
implementation as evidenced in the report.
• A breakdown of how marks are awarded is provided in the report structure
template in Appendix A.
These 3 criteria have roughly equal weighting (each worth 25-40%).
If you submit work after the deadline you will incur a deduction of 5% of the
mark for each working day that the work is late after the deadline. Work
submitted more than 5 working days late will be graded as 0. This is the same
lateness policy applied university wide to all undergraduate and postgraduate
programmes.
Assignment Help & Feedback
The lab classes should be used for feedback from demonstrators and the module
leaders. You should aim to work iteratively by seeking feedback throughout the
semester. If you leave your assignment work until the final week you will limit your
opportunity for feedback.
For questions you should either bring these to the lab classes or use the course’s
Google group (COM452**group@sheffield.ac.uk) which is monitored by the
course’s teaching staff. However, as messages to the Google group are public to
all students, emails should avoid including assignment code; instead they should
be questions about ideas, techniques and specific error messages rather than
requests to fix code.
If you are uncomfortable asking questions, you may prefer to use the course’s
anonymous Google form. Anonymous questions must be well formed, as there is
no possibility for clarification, otherwise they risk being ignored.
Please do not email teaching assistants or the module leader directly for assignment help. Any direct requests for help will be redirected to the above
mechanisms for obtaining help and support.
Appendix A: Report Structure Template
Each stage should focus on a specific choice of technique which you have applied
in your implementation. E.g. OpenMP Scheduling, OpenMP approaches for
avoiding race conditions, CUDA memory caching, Atomics, Reductions, Warp
operations, Shared Memory, etc. Each stage should be no more than 500 words
and may be far fewer for some stages.
<OpenMP/CUDA>: Algorithm <Standard Deviation/Convolution/Data Structure>
Description
• Briefly describe how the stage is implemented focusing on what choice of
technique you have applied to your code.
Marks will be awarded for:
• Clarity of description
Justification
• Describe why you selected a particular technique or approach. Provide
justification to demonstrate your understanding of content from the
lectures and labs as to why the approach is appropriate and efficient.
Marks will be awarded for:
• Appropriateness of the approach. I.e. Is this the most efficient choice?
• Justification of the approach and demonstration of understanding
Performance
Size    CPU Reference Timing (ms)    <Mode> Timing (ms)
• Decide appropriate benchmark configurations to best demonstrate scaling
of your optimised algorithm.
• Report your benchmark results, for example in the table provided above
• Describe which aspects of your implementation limit performance. E.g.
Is your code compute, memory or latency bound on the GPU? Have you
performed any profiling? Is a particular operation slow?
• What could be improved in your code if you had more time?
Marks will be awarded for:
• Appropriateness of the used benchmark configurations.
• Does the justification match the experimental result?
• Have limiting factors of the code been identified?
• Has justification for limiting factors been described or evidenced?
