2016-03 Fast strong hash functions: SipHash/HighwayHash (Github)
High-speed hashes with thorough mixing and near-cryptographic strength.
Provides SipHash and a tree version (1.5x and 4.2x speedup) plus
"HighwayHash" (10x speedup).
2011-10 Efficient Algorithms for Large-Scale Image Analysis
Thesis demonstrating the feasibility of analyzing gigapixel images
within minutes on a single workstation. Introduces seven new algorithms
for various stages of the analysis pipeline that outperform
previous techniques by factors of 10-100 while maintaining output quality.
2011-09 Engineering the Ideal Gigapixel Image Viewer
Smooth pan and zoom in gigapixel images via lossless compression, asynchronous I/O and shaders.
2011-08-31 Lossless asymmetric single instruction multiple data codec
Novel SIMD predictor and entropy coder: 50% compression and 3 GB/s (per core) decompression.
2011-08 Engineering a Multi-Core Radix Sort
Expanded version published at EuroPar 2011; 10% speedup vs. 2010-08-17 technical report.
2010-09 Highly optimized weighted-IHS pan sharpening with edge-preserving denoising
Pan sharpening: fast (100 MPixel/s) and high-quality (reduced noise, adaptive weights).
2010-09 Fast, High-Quality Line Antialiasing by Prefiltering with an Optimal Cubic Polynomial
Software line rasterizer with optimal low-pass filter; outperforms mid-range GPUs.
2010-09 Highly Efficient Screening for Point-Like Targets via Concentric Shells
Asymptotically optimal pipelined divide and conquer algorithm for finding point-like objects.
2010-08-17 Faster Radix Sort via Virtual Memory and Write-Combining
Sort throughput > 88% of memory bandwidth (1.24x speedup vs. a Fermi GPU).
Efficient Parallel Algorithm for Graph-Based Image Segmentation
Fast but high-quality image segmentation, made possible by a new
parallel algorithm that doesn't just chop images into tiles.
Additional Information: Paper
2008-02 Determination of Maximally Stable Extremal Regions in Large Images
Efficient algorithm for extracting MSERs (e.g. for image segmentation).
Pitfalls and Solutions
Describes PC timing hardware, the pitfalls it poses for reliable,
high-resolution timing, and a solution.
Gebäudemodellierung aus Laserscanning-Daten (Building Modeling from Laser Scanning Data)
Diploma thesis: an algorithm for automatic building reconstruction from
Additional Information: Presentation
2006-03-26 Speeding up
A drop-in replacement for VC7.1's memcpy that is 3.5 times as
fast on an Athlon XP.
File Accesses via Ordering and Caching
Study thesis: how to speed up file loading by a factor of 10.
to Program Optimization
A quick rundown on optimizing for size and speed.