Using the Xeon Phi: Difference between revisions

From MRC Centre for Outbreak Analysis and Modelling
Revision as of 17:27, 11 December 2015

The Xeon Phi is essentially Linux installed on a chip, on a card, inside some of our nodes. It's extremely multi-core, so it lends itself to algorithms that have reasonably simple but very parallel sections. Here is the experience I've collected so far in using it.

Using Visual Studio with Intel C++ Parallel Compiler

Set up the Project

  • Create a new 64-bit project.
  • Get into Release, x64 mode with the menus at the top.
  • Choose the Intel Compiler, from the Projects menu.
  • In Project Preferences:-
    • Linker, General [Intel C++], Additional Options for MIC Offload Linker, add -no-fortlib
    • (The above is a workaround - the compiler assumes everyone has Fortran installed and cries if we don't.)
    • C/C++, Code Generation [Intel C++], Enable OpenMP Offloading Compilation... - choose Intel MIC Architecture
    • C/C++, Language [Intel C++], OpenMP Support: Generate Parallel Code
    • C/C++, Language, Runtime Library: Multi-threaded (/MT) - for all the good it does. We'll copy DLLs later.

Some Code

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
  
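  int th_id;  /* thread ID, private to each OpenMP thread */
  
  /* Run the parallel region on the Xeon Phi rather than on the host. */
  #pragma offload target(mic)
  #pragma omp parallel private (th_id)
  {
    th_id = omp_get_thread_num();
    printf("Hello World from thread %d\n", th_id);
    /* Wait until every thread has printed, then report the count once. */
    #pragma omp barrier
    #pragma omp single
    printf("There are %d threads\n", omp_get_num_threads());
  }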
  return EXIT_SUCCESS;
}

Now compile it. You might get some "warning #3335: *MIC* offload features on this platform currently require that RTTI be disabled" messages - don't worry about them.

Prepare a cluster job

  • Decide where to run the job as normal. I'm going for T:\Wes\Phi (which is \\fi--didef2\Tmp\Wes\Phi) in this example.
  • Copy the executable there, which will be in Release\x64 in your project folder. Mine is called PhiTest.exe.
  • Find a folder something like: C:\Program Files (x86)\IntelSWTools\parallel_studio_xe_2016.1.051\compilers_and_libraries_2016\windows\redist\intel64_win\compiler
    • Copy cilkrts20.dll, libiomp5md.dll and liboffload.dll to your run folder (T:\Wes\Phi for me)
  • We also need to copy the library for the Phi itself to have. Find a folder something like: C:\Program Files (x86)\IntelSWTools\parallel_studio_xe_2016.1.051\compilers_and_libraries_2016\windows\compiler\lib\mic
    • Make a folder called lib in your test folder (i.e., T:\Wes\Phi\lib), and copy all the files you just found into it, including the "locale" folder. If you like command-line copying, run something like this from the folder you just found:
      xcopy *.* /e T:\Wes\Phi\lib

A batch file to run the job

I'll call this run.bat, and put it in T:\Wes\Phi. I'll assume we'll have the working directory set, so...

set MIC_LD_LIBRARY_PATH=\\fi--didef2\Tmp\Wes\Phi\lib
PhiTest.exe

And my launch file

job submit /scheduler:fi--didemrchnb /numnodes:1 /singlenode:false /jobtemplate:Phi /workdir:\\fi--didef2\Tmp\Wes\Phi /stdout:out.txt /stderr:err.txt run.bat

Remember that /singlenode:false is the silly hack we have to use when we ask for a single, whole node.

And the result

In my out.txt, I have...

\\fi--didef2\Tmp\Wes\Phi>set MIC_LD_LIBRARY_PATH=\\fi--didef2\Tmp\Wes\Phi\lib 

\\fi--didef2\Tmp\Wes\Phi>\\fi--didef2\Tmp\Wes\Phi\PhiTest.exe
Hello World from thread 194
Hello World from thread 182
Hello World from thread 114
Hello World from thread 176
....
Hello World from thread 18
There are 240 threads

That's a lot of threads.