Console to Chrome
HTML5 and JS for game developers
What's this hour about?
You
"Get" HTML5, game dev
Want to know how to actually do it
*this
HTML5 APIs and quirks
V8 and fast JavaScript
Chrome rendering and internals
"Chrome Games"
Two divergent camps
Ubiquitous code
Awesome + browser
*this: awesome games (first, ubiquity later)
Credible, professional games
Linkable, frictionless distribution
Not for every browser or device
Building Blocks
HTML5 APIs for Games
WebGL
It's like OpenGL but...
OpenGL ES 2.0 != OpenGL
No fixed function
readPixels, all get() calls, glFinish extra expensive
The performance is pretty great
And the spec is growing fast!
Audio
Chrome only (for now)
Accurate scheduling
Graph based
Nodes, filters
Nothing works everywhere
Fall back for other browsers
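A rough sketch of "accurate scheduling" and "graph based", assuming the Web Audio API with its current standard method names (older Chrome builds used a webkit prefix and different names). The helpers scheduleBeeps and createAudioContextOrNull, and their parameters, are hypothetical.

```javascript
// Hypothetical helper: schedules `count` short beeps on the audio clock.
// `ctx` is assumed to be a Web Audio AudioContext (or a compatible object).
function scheduleBeeps(ctx, count, intervalSec, durationSec) {
  var startTimes = [];
  var t0 = ctx.currentTime + 0.1; // small lead time so the first beep is not late
  for (var i = 0; i < count; i++) {
    var osc = ctx.createOscillator(); // source node
    var gain = ctx.createGain();      // processing node in the graph
    gain.gain.value = 0.2;
    osc.connect(gain);                // source -> gain -> speakers
    gain.connect(ctx.destination);
    var when = t0 + i * intervalSec;
    osc.start(when);                  // scheduled on the audio clock, not a JS timer
    osc.stop(when + durationSec);
    startTimes.push(when);
  }
  return startTimes;
}

// Feature-detect, and return null so the caller can fall back elsewhere.
function createAudioContextOrNull(globalObj) {
  var Ctor = globalObj.AudioContext || globalObj.webkitAudioContext;
  return Ctor ? new Ctor() : null;
}
```

In a page this might be called as scheduleBeeps(createAudioContextOrNull(window), 4, 0.5, 0.1) after checking for null.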
And so much more...
JavaScript
and how to make it fast
I'm talking JS in Chrome
Hello JS!
JavaScript
Typeless
JIT compiled
Garbage collected
Single threaded
^_^ robust
>..< lawless
Large-scale JS
Coding standards for sanity
Defensive coding
Layers of complexity under your app
Modular FTW
Able to restart and reconnect in pieces
High-performance JS
JS can be amazingly fast but...
Easy to write slow code
Small changes, big consequences
Difficult to tell what's going on
VM is a moving target
How does JS work?
JIT compiler
Parses your JS
Generates snippets of native code
Fast to compile
Fully general output
Numbers
Small ints (SMIs)
Immediate, fast
31 bits on 32 bit machines OR 32 bits on x64
64-bit "heap numbers"
Won't fit in an SMI and is not local
Wrapped and heap allocated
Slightly slower
Doubles may be optimized, but no guarantees
Arrays
Uint32Array, Float64Array, etc.
Memory efficient, no boxing
JS Arrays
API allows operations not possible in C
Backing storage: sparse vs. dense
C-like array OR hash table ("dictionary mode")
Many factors switch backing, e.g. space efficiency
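A small sketch of the typed-array point, assuming a particle system as the use case. The names createParticles and stepParticles and the interleaved layout are illustrative choices, not something the talk prescribes.

```javascript
// Hypothetical particle store: positions and velocities packed in typed arrays.
function createParticles(count) {
  return {
    count: count,
    pos: new Float32Array(count * 2), // [x0, y0, x1, y1, ...], zero-initialized
    vel: new Float32Array(count * 2)
  };
}

// Advances every particle in place; no per-particle objects are allocated.
function stepParticles(p, dt) {
  var pos = p.pos, vel = p.vel;
  for (var i = 0, n = p.count * 2; i < n; i++) {
    pos[i] += vel[i] * dt;
  }
}
```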
Bad idea
var a = new Array();
a[1000] = 8;
Doesn't make sense in C
Indexing OOB will trigger dictionary mode
Good idea
var a = new Array(1000);
a[0] = 13;
a[100] = 21;
Allocates 1000 entries up front
Indexing sparsely is OK
Backed by contiguous store of length 1000
Objects in JS
Objects are associative arrays
Key value property pairs
Properties can change dynamically
Prototype chains can change
You can make every single object unique
...but don't do that
Objects in V8
Large systems have structured data
Fast property lookup is critical
Hidden classes
Group objects with same structure
Shared across objects
Expensive to generate once, cheap afterwards
Property inline caching
Check hidden class on property lookup
First time fully generic lookup
Remember where you found the property
Generate new optimized code
Next time, direct access
Bad idea
function Vec2(x, y) {
  this.x = x;
  this.y = y;
}
var v0 = new Vec2(5, 8);
v0.z = 34;
Adding properties changes the hidden class
You pay for that
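One possible "better idea", sketched under the assumption that the object may sometimes need a z component: declare every property in the constructor, in the same order, so instances keep the same structure. The name Vec3 and the default of 0 are hypothetical.

```javascript
// Every instance gets x, y, z in the same order, so the structure never changes later.
function Vec3(x, y, z) {
  this.x = x;
  this.y = y;
  this.z = z === undefined ? 0 : z; // always present, even for "2D" use
}

var a = new Vec3(5, 8);
var b = new Vec3(5, 8, 34);
```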
Object properties storage
^_^ directly on object
^_^ array
>..< hash table "dictionary mode"
What triggers dictionary mode?
Too many properties
Change property attributes
Delete a property
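For the "delete a property" trigger, a common workaround is to clear the value by assignment rather than removing the key. This is a sketch; Player and clearTarget are hypothetical names.

```javascript
function Player(name) {
  this.name = name;
  this.target = null; // declared up front, even when empty
}

// Clears the field by assignment; the set of properties stays the same.
function clearTarget(player) {
  player.target = null; // instead of: delete player.target;
}
```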
Heavier operations than the JIT compiler
Warm up on fully-general path
Profile for hot functions
Non-deterministic sampling profiler
Mine types, specialize
New optimized code (inlining, LICM, GVN)
Speculative optimization
What does it optimize?
Not all constructs are handled
"bailout" == tried to optimize but quit
Function too long
tryCatch, ForIn, NonStringToString, etc.
V8 --trace-bailout
V8 --trace-opt
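Since try/catch is on the bailout list above, one workaround (for the V8 behavior this talk describes; other versions may differ) is to keep the try/catch in a thin wrapper and the hot loop in its own function. The function names and the fallback value are hypothetical.

```javascript
// Hot, C-like inner loop with no try/catch in its body.
function sumSquares(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i] * values[i];
  }
  return total;
}

// Thin wrapper that owns the try/catch; it does no heavy work itself.
function safeSumSquares(values) {
  try {
    return sumSquares(values);
  } catch (e) {
    return 0; // hypothetical fallback value
  }
}
```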
Deoptimization
But those optimizations were speculative
"deopt" == assumptions violated
Swap back to the slow general case
...and now you're running slow code
V8 --trace-deopt
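A sketch of keeping assumptions stable: convert data once at the boundary so the hot function always sees the same types. The names scale and scaleFromText are hypothetical, and whether a given call actually deoptimizes depends on the V8 version.

```javascript
// Intended to be called with numbers only.
function scale(value, factor) {
  return value * factor;
}

// Converts text to a number once, so scale() never sees a string.
// Note: scale("3", 2) would also return 6, but it feeds the hot function
// a different type than the one it was specialized for.
function scaleFromText(text, factor) {
  var n = Number(text);
  return scale(n, factor);
}
```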
Javascriptisms
so elegant, so slow
Image credit: http://goo.gl/w5ovo
Summary so far
Create a few well-defined object types
Don't change properties dynamically
Keep object property count under 30
Feed functions consistent data
Don't write enormous functions
Static and C-like is often fast
Garbage collection
Common first problem for large systems
Two generations
Young: small frequently collected space
Old: longer-lived data
V8 --trace-gc
Play nice with the GC
Promotion is expensive
Want very long or very short lived objects
Release your references
Execution contexts can hold onto references
Closures can hold onto references
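A small sketch of the closure point, assuming a loader that only needs one number from a large buffer. The name makeLevelInfo is hypothetical.

```javascript
// The returned closure refers only to byteLength, so it has no need
// to keep the (possibly large) buffer alive.
function makeLevelInfo(levelData) {
  var byteLength = levelData.byteLength; // copy out the small value we need
  return function describe() {
    return "level is " + byteLength + " bytes";
  };
}
```

Had describe() referenced levelData directly, the buffer would stay reachable for as long as the closure did.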
No littering
Avoid GC stalls
Most things are objects in JS
Temp variables
Values returned from functions
Closures
Use SMIs, scratchpads, update in place
Bad idea
function add(vecA, vecB) {
  return new Vector(
    vecA.x + vecB.x,
    vecA.y + vecB.y,
    vecA.z + vecB.z
  );
}
Allocates a new object for every vector add
Yikes!
Better idea
function addTo(vecA, vecB) {
  vecA.x = vecA.x + vecB.x;
  vecA.y = vecA.y + vecB.y;
  vecA.z = vecA.z + vecB.z;
}
Updates vecA in place
Uglier, but much friendlier to the GC
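The "scratchpad" idea can be sketched the same way: one preallocated temporary, reused for intermediate results. Vector is redefined here only to keep the sketch self-contained; addInto, lengthOfSum and scratch are hypothetical names.

```javascript
function Vector(x, y, z) {
  this.x = x;
  this.y = y;
  this.z = z;
}

// One preallocated scratch vector, reused for intermediate results.
var scratch = new Vector(0, 0, 0);

// Writes a + b into `out` and returns `out`; allocates nothing.
function addInto(out, a, b) {
  out.x = a.x + b.x;
  out.y = a.y + b.y;
  out.z = a.z + b.z;
  return out;
}

// Length of (a + b) without creating a temporary object.
function lengthOfSum(a, b) {
  var s = addInto(scratch, a, b);
  return Math.sqrt(s.x * s.x + s.y * s.y + s.z * s.z);
}
```

The shared scratch value is overwritten on every call, so results must be consumed before the next use.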
Life in Chrome
Being in a browser
Uncertainty everywhere
Browser capabilities vary
Local hardware performance
Your game lives in a tab
Tab can close at any time
Other apps in your thread
Additional compositing and rendering
Local hardware
You know nothing about the local environment
Games vs. the web: mismatched expectations
Graceful degradation
Micro benchmarks
Run tests while loading
Collect data while running
Collect data, communicate, set expectations
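A minimal sketch of a load-time micro benchmark. The thresholds and tier names are made up for illustration and would need tuning against real workloads.

```javascript
// Returns milliseconds from a high-resolution clock when available.
function nowMs() {
  return (typeof performance !== "undefined" && performance.now)
    ? performance.now()
    : Date.now();
}

// Runs `work` `iterations` times and returns the elapsed time in ms.
function timeWork(work, iterations) {
  var start = nowMs();
  for (var i = 0; i < iterations; i++) {
    work(i);
  }
  return nowMs() - start;
}

// Hypothetical thresholds: map a measured time to a quality tier.
function pickQuality(elapsedMs) {
  if (elapsedMs < 5) return "high";
  if (elapsedMs < 20) return "medium";
  return "low";
}
```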
More complexity, please
Hello command buffers!
A funny thing happened on the way to the GPU...
Buffers are shared and limited in size
Textures, arraybuffers, commands
Don't spill the buffer
Spill == flush
Flush == stall
Limit upload size per frame
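One way to limit upload size per frame is a simple byte budget. This is a sketch; UploadQueue, its methods, and the budget value are hypothetical, and the right budget depends on the buffer sizes in the browser build you target.

```javascript
// Hypothetical per-frame upload throttle.
function UploadQueue(maxBytesPerFrame) {
  this.maxBytesPerFrame = maxBytesPerFrame;
  this.pending = []; // items: { bytes: number, upload: function }
}

UploadQueue.prototype.enqueue = function (bytes, upload) {
  this.pending.push({ bytes: bytes, upload: upload });
};

// Call once per frame. Uploads items in order until the budget is used up.
// An oversized item is still sent alone so the queue cannot get stuck.
UploadQueue.prototype.flushFrame = function () {
  var used = 0, sent = 0;
  while (this.pending.length > 0) {
    var next = this.pending[0];
    if (sent > 0 && used + next.bytes > this.maxBytesPerFrame) break;
    this.pending.shift();
    next.upload(); // e.g. a texture or buffer upload call
    used += next.bytes;
    sent++;
  }
  return sent;
};
```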
Compatibility headache
Drivers
Test by creating a context
DirectX 9 on Windows
Software rasterizer
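"Test by creating a context" might look like the following sketch. The function name tryCreateWebGL is hypothetical; the two context names are the standard one and the older experimental one.

```javascript
// Tries the standard and the older prefixed context names.
// Returns the context, or null when WebGL is unavailable.
function tryCreateWebGL(canvas) {
  var names = ["webgl", "experimental-webgl"];
  for (var i = 0; i < names.length; i++) {
    try {
      var gl = canvas.getContext(names[i]);
      if (gl) return gl;
    } catch (e) {
      // Some browser/driver combinations throw instead of returning null.
    }
  }
  return null;
}
```

In a page: tryCreateWebGL(document.createElement("canvas")), then fall back to another renderer when the result is null.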
Rendering loop
setInterval, setTimeout >..<
To the browser, it's just another event, not "animation"
Runs even when backgrounded or overloaded
requestAnimationFrame (RAF) ^_^: tries to call at 60Hz, adapts to load
Manually skip frames if necessary
Feeds the pipeline at a consistent rate
JS, GPU work stay in sync
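A sketch of a frame-callback loop with manual frame skipping, assuming fixed-step updates. createLoop and its parameters are hypothetical; the scheduler is passed in so the same code can be driven by requestAnimationFrame in a page or by a fake scheduler in a test.

```javascript
// Hypothetical game loop: fixed-step updates, one render per animation frame.
function createLoop(raf, update, render, stepMs, maxStepsPerFrame) {
  var last = null, acc = 0, running = false;

  function frame(nowMs) {
    if (!running) return;
    if (last !== null) acc += nowMs - last;
    last = nowMs;
    var steps = 0;
    while (acc >= stepMs && steps < maxStepsPerFrame) {
      update(stepMs);
      acc -= stepMs;
      steps++;
    }
    if (acc >= stepMs) acc = 0; // fell too far behind: skip the backlog
    render();
    raf(frame);
  }

  return {
    start: function () { running = true; raf(frame); },
    stop: function () { running = false; }
  };
}
```

In a page the scheduler would typically be wrapped, for example function (f) { requestAnimationFrame(f); }, so the browser function is called with its expected receiver.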
The life of a frame
RAF called first each frame
Some frame budget goes to Chrome
JS causes draw calls to queue up
User input
Input blocking
User input blocked while main thread is busy
Queuing delays
RAF vs. user input
All work should be early and contiguous
RAF allows Chrome to pack work into frames
RAF vs. user input
Events can cause work "whenever"
Timers, DOM input handlers
Non-contiguous work
This frame will be late
Buffer your inputs and handle in RAF
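Buffering inputs could be sketched like this: event handlers only record what happened, and the frame callback drains the queue. InputBuffer and its method names are hypothetical.

```javascript
// Hypothetical input buffer: handlers record events, the frame callback drains them.
function InputBuffer() {
  this.events = [];
}

// Cheap enough to call from a DOM event handler.
InputBuffer.prototype.push = function (type, data) {
  this.events.push({ type: type, data: data });
};

// Call at the start of the frame callback. Returns the number of events handled.
InputBuffer.prototype.drain = function (handle) {
  var handled = 0;
  for (var i = 0; i < this.events.length; i++) {
    handle(this.events[i]);
    handled++;
  }
  this.events.length = 0; // reuse the same array next frame
  return handled;
};
```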
Summary so far
Throttle data upload per frame
Budget 2-4 ms / frame for Chrome
Queue user inputs to handle in RAF
Move work off the main thread
Be aware of Chrome's rendering cycle
Developer Tools
Chrome flags
about:flags
about:memory
about:gpu
Chrome developer tools
about:tracing
console.time("tag");
console.timeEnd("tag")
V8 flags
V8 --trace-deopt, etc.
Mac:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --js-flags="--trace-bailout"
Windows:
output goes to Windows Debug (NOT stdout)
Useful flag documentation
Chrome <3's YOU
People WILL listen
It's NOT a stupid question
That bug HASN'T already been reported
Thanks!