Snow! AS3 filters and performance tests

Posted on July 22, 2009

Inspired by the falling snow in the third episode of Mushishi, “Tender horns”, I decided that a snow generator might be a nice benchmark for several items:

  1. Using filters
  2. Addressing a LOT of particles
  3. Exploring the boundaries of AS3 performance and looping through arrays

Why snow?

Snow is reasonably limited in its behaviour and nice to look at. It falls down. It can be moved by wind. It can flutter when there is hardly any wind. It is white and has a certain semi-transparency to it.

Screenshots:

Source of inspiration: Mushishi – episode 3 – snow fall

picture-67

Snow using Blendmode

picture-651

Snow using Blur filter

picture-661

I only loosely based my snowfall on the Mushishi scene (mainly the sheer amount of snow). The snow created with Blur (the third image) is somehow less convincing to me than the snow using the “add” Blendmode, as the blurred flakes lack transparency. Beautiful snow has its price, however. With only 1000 particles the Blendmode snow already makes Flash consume 100% processor power on my MacBook Pro.

My assumptions

I assumed that the Blendmode filters (“add” and “screen”) would perform better than the Blur filter and Alpha. In theory the Blendmode filters are binary manipulations like “add”, “OR”, “XOR”, “bit shift left” and “bit shift right”. These are instructions that an (old-school) processor needs only one or two clock cycles to perform. Blur and transparency require “snapshots” of the underlying layers on which the blur and transparency effects are placed. You would think that takes many more clock cycles.
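To make the arithmetic behind that assumption concrete, here is a minimal sketch of the textbook per-channel formulas for “add”, “screen” and plain alpha compositing on 0..255 colour values. It is written in TypeScript rather than AS3, the function names are hypothetical, and it says nothing about how the Flash renderer actually implements these modes.

```typescript
// Textbook per-channel blend formulas on 0..255 values.
// Assumption: illustrative only, not the Flash Player implementation.

// "add": sum of source and destination, clamped at 255.
function blendAdd(src: number, dst: number): number {
  return Math.min(255, src + dst);
}

// "screen": invert both, multiply, invert again.
function blendScreen(src: number, dst: number): number {
  return 255 - Math.round(((255 - src) * (255 - dst)) / 255);
}

// Plain alpha compositing: weighted mix of source and destination.
function blendAlpha(src: number, dst: number, alpha: number): number {
  return Math.round(src * alpha + dst * (1 - alpha));
}
```

Each formula is only a handful of operations per channel, and every one of them needs the destination pixel as input, so the per-pixel arithmetic alone does not obviously favour one mode over another.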

My goal

I strive to keep my Flash sites below 20% processor usage when the user is doing nothing. For something extra like snow, a constant 50% processor usage is already a lot.

What I did

I tried single and multiple “snow flakes” per particle, compared different settings for my particle count, and tried different approaches to animating the particles via code. For the Flash render engine there is hardly any performance difference between 1000 individual snowflakes and 1000 snowflakes spread over 250 particles. Both approaches make the render engine bog down.

The setup of the snowflake engine:

  1. Each particle is pushed into an array
  2. There are four arrays for four layers of snow (more distant layers move slower and contain smaller particles)
  3. Updating particles is done by cycling through the arrays and adding values to the current X/Y coordinates
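The setup above can be sketched roughly as follows. This is a TypeScript approximation of the logic, not the original AS3 source; the `Flake` fields, the scale factors and the stage dimensions are all assumptions chosen for illustration.

```typescript
// Hypothetical particle shape; field names are assumptions, not the original code.
interface Flake {
  x: number;
  y: number;
  dx: number;   // horizontal drift per cycle (wind)
  dy: number;   // fall speed per cycle
  size: number;
}

const LAYER_COUNT = 4;

// Build four arrays, one per layer. Index 0 is the nearest layer;
// more distant layers get smaller and slower flakes.
function createLayers(flakesPerLayer: number): Flake[][] {
  const layers: Flake[][] = [];
  for (let layer = 0; layer < LAYER_COUNT; layer++) {
    const scale = 1 - layer * 0.2; // illustrative values only
    const flakes: Flake[] = [];
    for (let i = 0; i < flakesPerLayer; i++) {
      flakes.push({
        x: Math.random() * 800,
        y: Math.random() * 700 - 200,
        dx: 0.5 * scale,
        dy: 3 * scale,
        size: 4 * scale,
      });
    }
    layers.push(flakes);
  }
  return layers;
}

// One animation cycle: walk every array and add the per-cycle
// deltas to the current X/Y coordinates of each particle.
function updateLayers(layers: Flake[][]): void {
  for (let l = 0; l < layers.length; l++) {
    const flakes = layers[l];
    for (let i = 0; i < flakes.length; i++) {
      const flake = flakes[i];
      flake.x += flake.dx;
      flake.y += flake.dy;
    }
  }
}
```

In a real Flash movie `updateLayers` would presumably run once per frame, with each `Flake` backed by a display object.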

What I found

Processor usage ADD / BLUR – 3000 particles

  1. ADD: processor to 130%
  2. BLUR: processor to 75%

Checking X/Y position of each particle

To see whether the particles are out of sight I simply check if the Y-coordinate is larger than a specific number. In my case: 500. Then I subtract 700 to place the flakes “in the sky” again. A simple “perpetual fall” scenario.

  1. If/then on every cycle: processor to 70%. The snow seems to fall less smoothly, with more “hiccups” in the animation
  2. If/then only on every 20th cycle: processor to 75%. The snow seems to fall a bit more smoothly
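The two variants of the “perpetual fall” check could look something like this. The numbers 500, 700 and 20 come from the text; everything else (the TypeScript rendering, the names `FallingFlake`, `stepCheckAlways` and `stepCheckSometimes`) is assumed for the sake of the sketch.

```typescript
const BOTTOM_Y = 500;       // below this a flake is out of sight
const RESET_DISTANCE = 700; // how far a flake is moved back up
const CHECK_INTERVAL = 20;  // second variant: check every 20th cycle

// Hypothetical minimal particle for this example.
interface FallingFlake {
  y: number;
  dy: number;
}

// Put a flake back "in the sky" once it has fallen out of sight.
function wrapIfOutOfSight(flake: FallingFlake): void {
  if (flake.y > BOTTOM_Y) {
    flake.y -= RESET_DISTANCE;
  }
}

// Variant 1: move and check every flake on every cycle.
function stepCheckAlways(flakes: FallingFlake[]): void {
  for (let i = 0; i < flakes.length; i++) {
    flakes[i].y += flakes[i].dy;
    wrapIfOutOfSight(flakes[i]);
  }
}

// Variant 2: move on every cycle, but check only on every 20th one.
function stepCheckSometimes(flakes: FallingFlake[], cycle: number): void {
  const checkNow = cycle % CHECK_INTERVAL === 0;
  for (let i = 0; i < flakes.length; i++) {
    flakes[i].y += flakes[i].dy;
    if (checkNow) {
      wrapIfOutOfSight(flakes[i]);
    }
  }
}
```

With the second variant a flake can keep falling for up to 19 extra cycles before it is reset, which goes unnoticed only as long as that overshoot stays off screen.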

Calling a function in the object versus updating via references in the “for” loop

When the snow falls, the X/Y coordinates of each particle are updated by adding a value to the current X/Y coordinates. When doing this in the “for” loop I chose to limit my test to two approaches:

  1. Assign the reference of the object to a variable. Update the settings of the object via the object reference in the variable. The assumption is that this is cheaper than constantly referring to the object via pointers in the array.
  2. Call an “update” function in each particle. I assumed that calling a function would be more expensive than assigning the object reference from the Array to a variable. It turned out that calling the function and letting the particle take care of itself is a bit cheaper.

Again: the snow seems to fall more smoothly when the second scenario is followed.
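A minimal sketch of the two approaches, again in TypeScript with hypothetical names (`Particle`, `updateViaReference`, `updateViaMethod`). Both produce the same positions; which one is cheaper depends on the runtime, so the measurements above apply to the Flash player tested and not to this code.

```typescript
// Hypothetical particle class with its own "update" function.
class Particle {
  x: number = 0;
  y: number = 0;
  dx: number;
  dy: number;

  constructor(dx: number, dy: number) {
    this.dx = dx;
    this.dy = dy;
  }

  // The particle takes care of its own state.
  update(): void {
    this.x += this.dx;
    this.y += this.dy;
  }
}

// Approach 1: store the reference in a local variable and
// update the coordinates directly inside the loop.
function updateViaReference(particles: Particle[]): void {
  for (let i = 0; i < particles.length; i++) {
    const p = particles[i];
    p.x += p.dx;
    p.y += p.dy;
  }
}

// Approach 2: call the particle's own "update" function.
function updateViaMethod(particles: Particle[]): void {
  for (let i = 0; i < particles.length; i++) {
    particles[i].update();
  }
}
```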

Many particles with one flake versus few particles with many flakes

For the Flash render engine, the number or size (I did not check that yet) of elements is one of the determining factors for performance.

I found no noticeable performance difference when creating fewer particles with more flakes per particle.

No (blur) filter

Surprisingly, when I let the engine animate snowflakes without any filter, the processor use leaps to 100%, compared to the 75% it takes when Blur is applied. Apparently Flash already performs some optimization on blurred objects, very likely by rendering them as bitmaps.

Only when “Cache as bitmap” is switched on are the “unfiltered” flakes in the same usage range as the blurred flakes.

Cache as bitmap on/off

There is no measurable difference between Cache as bitmap on/off.

Blur applied to all flakes via parent clip

When the blur is applied via the parent clip to all flakes at once, there is no gain. On top of that, the snow becomes one unclear blob.

Conclusions on blur and add-blendmode

Based on the above tests and findings, and with no warranties (as these tests were limited in their range):

  1. Blur and Alpha are much cheaper than Blendmode filters (for snowflakes)
  2. There is no significant loss when particles are asked to take care of their own state (by calling the “update” function of the particle)
  3. There is no significant loss when each particle has its own blur
  4. If/Then statements are costly. Although the processor usage seems to remain stable, the animation seems to be more sluggish
  5. Caching as bitmaps makes no difference for snow particles
Posted in: Experiments