Overview:

- Multicore to manycore
- Parallel Patterns, aka 13 Dwarfs
- Compelling applications for handhelds and laptops

**SIGPLAN**

Keutzer:

- Dataflow: pipe / filter
- Semantics of parallel engineering / architecture
- Verification by approximate equality to a serial version
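One way to picture the pipe / filter and approximate-equality points together is a minimal sketch (not anything shown in the talk). The filter names `scale` and `square`, the chunking scheme, and the tolerance are all assumptions made for illustration. Floating-point addition is not associative, so a chunked parallel sum can differ from the serial one in the last bits. That is why the check is approximate rather than exact.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def scale(values, factor):
    """Filter 1 (hypothetical): multiply every item flowing through the pipe."""
    for v in values:
        yield v * factor


def square(values):
    """Filter 2 (hypothetical): square every item flowing through the pipe."""
    for v in values:
        yield v * v


def pipeline(values):
    """Pipe the two filters together: scale, then square."""
    return square(scale(values, 0.1))


def serial_total(data):
    """Reference result: one in-order pass over the whole input."""
    return sum(pipeline(data))


def parallel_total(data, workers=4):
    """Run the same pipeline on chunks concurrently, then combine partial sums."""
    chunk = max(1, math.ceil(len(data) / workers))
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: sum(pipeline(c)), chunks))
    return sum(partials)


def approximately_equal(a, b, rel_tol=1e-9):
    """Tolerance-based comparison; the tolerance value is an assumption."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-12)


data = [float(i) for i in range(1, 10001)]
assert approximately_equal(serial_total(data), parallel_total(data))
```

The serial version acts as the oracle. The parallel version is accepted if it lands within the chosen tolerance.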

Sen:

- Verification by exhaustive testing / proof
- Combinatorial explosion
- Needs real-time verification (incorporates time as a variable)

**SIGAPP**

Keaveny:

- CT Scans -- dense mesh
- Feature extraction (bone, blood vessel) results in sparse mesh
- Unstructured grid
- Graph traversal
- Dense and sparse LA
- Goal: run the analysis on a physician's laptop, enabling analysis and prediction in the field

Morgan:

- Speech recognition goals
  - 100% identification
  - Real-time
- State of the art
  - 10-30x slower than real time
  - 50% incorrect in moderately noisy environments
  - Bound by Amdahl's Law
- Basic block diagram:
  - Signal processing frontend
    - MFCC (Mel-Frequency Cepstral Coefficients)
    - PLP (Perceptual Linear Prediction)
  - Feature extraction / probability estimates
  - Time-synchronous Viterbi algorithm against phonemes / lexicon
- Parallel goals
  - Multiple signal-processing frontends (128 in parallel)
  - Modulation spectrum analysis
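To make the Amdahl's Law bound concrete, here is a small sketch of the standard formula, speedup = 1 / ((1 - p) + p / n). The parallel fraction used below (0.9) is a made-up illustration, not a figure from the talk. Only the count of 128 parallel units comes from the notes.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Upper bound on speedup when only part of the work can run in parallel.

    parallel_fraction: share of the serial runtime that parallelizes (0..1).
    n_units: number of parallel processing units (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


# Hypothetical: 90% of the recognizer parallelizes across 128 frontends.
bound = amdahl_speedup(0.9, 128)
print(f"speedup bound: {bound:.2f}x")
```

Under that assumed fraction, the speedup stays below 10x however many frontends are added, because the serial 10% dominates.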

Wessel:

- Hearing augmentation
- Noise reduction
- Scene segregation
- Music enhancement
- Speech enhancement
- Hearing restoration / normalization

**LUNCH**

- Intel 80-core
  - 8k RAM per core
  - Algorithms implemented
    - PDE Solver
    - 2D FFT
- Ptolemy Project
  [http://ptolemy.eecs.berkeley.edu]
  - Explicitly deterministic parallel programming
  - Precision semantics of dataflow
  - Code generation (C)
  - New directions in real-time constraints / verifications

**SIGSOFT**

Yelick:

- Efficiency Layer
  - For experts
  - Fine-grained control over parallel execution modes
- Productivity Layer
  - Language for coordination / composition
  - Interface specifications allow more dynamic scheduling
  - Test-based specification defining (in)dependencies, etc.
  - Static analysis technique to prove independence
- Big debate about terminology: "frameworks" vs. "libraries", drivers, etc.
- Directors
  - Define semantics
  - Scheduling
    - Can be at runtime

Discussion:

- MapReduce is the best example
See: [google:sawzall mapreduce]
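For reference, the MapReduce pattern can be sketched in a few lines as a single-process word count. This is only an illustration of the map / shuffle / reduce structure, not Google's implementation or API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run the three phases in sequence over a list of documents."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = map_reduce(["the cat sat", "the dog sat down"])
print(counts)
```

Each `map_phase` call touches only its own document and each `reduce_phase` call only its own key, so both phases can be farmed out to separate workers without coordination. That independence is what makes the pattern a natural example here.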

Jones:

- Web 2.0 on embedded platform
- UI concerns
- etc
- Browser is new OS
- Handheld is new desktop
- Need multicore to parse big pages on low power
- Parsing is easily done in parallel
- Sketching
- Fill in unknowns to implement spec
e.g. complex loop constructs for optimization
- SkipJax / FlapJax
- AJAX language
- Dataflow language, event-streams map onto document properties
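The sketching idea ("fill in unknowns to implement spec") can be illustrated with a toy: the programmer writes a fast implementation with a hole, and a search fills the hole so the result matches a slow but obviously correct spec. The brute-force search and the names `spec`, `sketch`, and `synthesize` are assumptions for illustration only. Real sketching tools are assumed to use far more capable solvers and to handle richer holes, such as the loop constructs mentioned above.

```python
def spec(x):
    """Reference behaviour: slow but obviously correct."""
    return x * 8


def sketch(x, hole):
    """Optimized implementation with one unknown: the shift amount."""
    return x << hole


def synthesize(spec_fn, sketch_fn, candidates, test_inputs):
    """Return the first hole value for which the sketch matches the spec
    on every test input, or None if no candidate works."""
    for hole in candidates:
        if all(sketch_fn(x, hole) == spec_fn(x) for x in test_inputs):
            return hole
    return None


hole = synthesize(spec, sketch, range(0, 8), range(-16, 17))
print("hole filled with:", hole)
```

The check here is only as strong as the test inputs. Proving the filled-in sketch equivalent to the spec for all inputs would need verification beyond this toy.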

**BREAK**

**SIGOPS**

Kubi:

- Secure but non-invasive support for Parallel Software on Manycore Arch.
- OS features
  - Resource allocation
  - Device virtualization (most problems are device drivers)
  - Common facilities
  - Security
- OS design
  - Tradeoff of security vs performance
- New directions
  - Resources not limited
  - Protection domains are not heavyweight
  - Cost of messaging is low
  - Memory not limited (some violently disagree)
  - Hierarchical FS is outdated
- Approaches
  - Spatial partitioning of processors, unrestricted message passing in each partition
  - Minimalism; hypervisor resident on each partition
  - Fine-grained protection
  - Fault isolation

**SIGARCH**

Krste:

- Platforms
  - General purpose apps
  - App-specific
- Why manycore
  - Power
  - Verification
  - Redundancy
- Problems
  - Glue
  - Sequential
- Spatial partitions
  - Tiles, gangs
  - Message passing
  - Management
  - Restricted to power-of-2 on each dimension of core array
- Atomicity
- I/O channels
  - Probably in the form of high-speed serial links
  - Network links, etc.