An early focus-stacked image of mine using Adobe Photoshop™
Although conceptually simple, the practicalities of focus-stacking are intense and intricate; so I spent a fortnight looking into the technique in detail, in the hope that the notes arising will be of some interest to others...
What it is...
When a lens is focussed there is only one plane, at the set focus distance, which is precisely sharp. Some region around that plane (in front and behind it) will be 'acceptably sharp' – not truly in perfect focus, but close enough to satisfy the eye. That region is, of course, the depth-of-field.
In this example the point of critical focus is placed at the eye of the forward chess piece. Most of the face is 'acceptably sharp'. The base becomes somewhat soft, but that is fine as it is just the face where we want to direct attention.
The very tip of the nose is also slightly soft, and for this image that is quite unfortunate.
The rear chess piece is completely out of focus, but is still recognisable and adds good context to the image.
All in all, this is a classic macro image which makes good use of the appeal of a limited depth-of-field; albeit with a slight problem at the tip of the nose.
The softness of the nose could be addressed either by:
rotating the chess piece so that it was more 'square on' to the camera;
focussing further forward, at the mid-muzzle (rather than on the eye);
using a smaller aperture;
or moving the camera further away.
None of these solutions are quite right for the shot though.
Changing the aspect of the piece (so it is square on) fundamentally affects the nature of the composition; it just wouldn't be the same image, so that's hardly a solution at all!
Although depth-of-field is useful, it doesn't really 'fool' the eye; we can still tell exactly where the point of critical focus lies and it would just look weird to place it mid-muzzle so that the eye and nose were both 'acceptably sharp'. The whole concept of 'acceptably sharp' only holds true when 'critically sharp' is at the right place, not at some random point half-way down the snout! (This is exactly why relying on the Hyperfocal Distance can oft times result in disappointing images, but that's another story...).
Whilst small apertures deliver greater depth-of-field, they do so at the expense of overall sharpness – with small apertures (certainly f/16 or smaller) the plane of critical focus is softer than at wider apertures due to the effect of the diffraction that light suffers when passing through very small holes. And macro shots certainly want the sharpest details; that's kind of the whole point of looking really closely at things.
Moving further away would certainly have helped, but then the image would have needed a greater crop. I want to create the biggest possible image of the smallest possible detail, so moving further away is utterly counter to the whole point of the shot!
In the olden days we would see the softness at the tip of the nose as just being a part of the way of things; the physics of optics...
...until focus-stacking came along and we could composite an image from individual frames taken with different focus points.
A focus-stacked composite image from 48 individual shots
For this image focus was acquired at the very furthest point of the rear chess piece, ensuring the subject would have a pin-sharp outline. The camera was then moved 0.5mm further away without refocussing, which brought the plane of critical focus forwards by half a millimetre, and a second shot was taken. This process was repeated until the full depth of the subject (2.4cm) had been covered. 48 frames were taken with a thin sliver of the subject in focus in each frame:
4 separate frames showing the focus shift across the subject
But why so many frames? Our earlier image had almost a full chess piece in focus in a single shot.
Well, here we have two chess pieces so immediately that's twice the depth we are trying to capture. Also we are much closer to the subject here, with a magnification of 1x (life size) as opposed to about 1/3rd life size of the earlier shot. Depth-of-field reduces dramatically at high magnifications.
In an early trial I was shooting every millimetre, but in the results I could see areas of softness where a sharp image had simply not been captured. At 1x magnification f/11 would probably be fine at 1mm increments, but I feel f/8 is the optimal quality setting for my lens and I prefer to work with more shots than risk the overall clarity of the result.
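The arithmetic behind the frame count and the step size can be sketched as below. This is only a rough illustration: it uses the common close-up approximation for total depth-of-field, and the circle of confusion (0.03mm, a conventional full-frame figure) is my assumption rather than a measured value for any particular camera.

```python
import math

def frames_needed(subject_depth_mm, step_mm):
    """Number of frames needed to cover the subject depth at a fixed rail step."""
    return math.ceil(subject_depth_mm / step_mm)

def macro_depth_of_field_mm(f_number, magnification, coc_mm=0.03):
    """Approximate total depth-of-field for close-up work.

    Uses the close-up approximation DOF = 2 * N * c * (m + 1) / m^2.
    coc_mm (circle of confusion) is an ASSUMPTION: 0.03 mm is a
    conventional full-frame value, not a figure taken from the shoot.
    """
    return 2 * f_number * coc_mm * (magnification + 1) / magnification ** 2

# 2.4 cm of subject covered in 0.5 mm steps
print(frames_needed(24, 0.5))

# Depth-of-field per frame at life size (1x) for two apertures
for n in (8, 11):
    print(f"f/{n}: about {macro_depth_of_field_mm(n, 1.0):.2f} mm")
```

Under those assumptions the estimate comes out a little under 1mm at f/8 and a little over 1mm at f/11, which at least agrees with what I saw: gaps at 1mm steps with f/8, and the feeling that f/11 would probably have coped.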
How is it done?
The basic principles of creating a focus-stack are outlined above – but the practicalities are quite exacting. Much can be achieved with simple (or cheap) equipment, handled well; whereas nothing good will come from the most expensive equipment handled poorly. In fact the greatest success lies not in the equipment at all, but in the sympathetic and careful management of the shoot.
Phase 1: Shooting The Sequence
Set-up for a focus-stack shoot
Shot-to-shot consistency is critical. The camera is mounted on a sturdy tripod – for sturdy read heavy. Many tripods are designed to be as lightweight as possible, but for focus-stacking something older and heavier is quite ideal; which has the advantage of making them less desired and cheaper too. Here I am using an old video tripod, from the days when video cameras were bulky things. It doesn't have a quick-release plate, which is an advantage as quick-release plates can have a little joggle or play in them. Everything needs to be sturdy. I've had to extend the tripod slightly, so I have extended the outer, thickest legs. The inner legs of tripods are thinner (so that they can collapse into the outer legs) and they make the support less stable when extended – so if extending the tripod use the outer legs, not the innermost.
My studio is carpeted, giving a soft surface that allows the tripod to rock somewhat. Even a millimetre of rock is significant at high magnifications, so I have placed a 5mm plywood sheet between the tripod and the floor, which I have weighted down to offer greatest stability. All of this matters.
Example of a focus rail
Between the camera and the tripod a focus rail (or focus stage) is used – this is the device that allows the camera to be shuffled by minute degrees. This example cost about £30 and is mechanically very sound (in the main). One full turn of the (red) screws moves the camera 1mm, so it is quite finely tuned. It is a pain resetting the position after each sequence, but it is the cheapest possible solution (there are cheaper rails available, but the construction quality becomes very poor below this price level).
There is a small amount of play between the rail and its foot, but I am aware of that and ensure, prior to taking each shot, that the rail has not swivelled by exerting a slight finger pressure on the left hand side. There is almost a sense of ritual as the sequence is executed: Adjust rail; Check alignment; Remove hand; Wait; Fire shutter; Count aloud... I count aloud each position of the rail as the shutter is firing so if I need to take a momentary rest between shots I'm left in no doubt about how far I have progressed. Staring at tiny millimetre marks shot-after-shot is quite tiring for the eyes and a rest every dozen or so shots is inevitable. Also, tension can build up just by dint of concentration and sometimes I find my hands shaking as I attempt an adjustment. Time for a breather.
I always start a sequence of shots at the furthest distance from the subject (with the front most point of the subject in focus) and then edge the camera forwards during the sequence (finishing with the furthest point of the subject in focus). If I decide to do a second run through of the sequence I return the camera/rail to the starting point. It is tempting to execute a second sequence in reverse as that saves all the winding needed to get the camera back to the start BUT consistency is everything, so I shoot each sequence identically, front to back. This also helps in post-processing as I always know the relative order of the shots – later shots (higher frame numbers) are towards the back of the subject. Excellence is habit, according to Aristotle anyway.
Once a sequence is complete it's worth taking a dark-frame shot (lens cap on). It really helps in post as the dark thumbnails demarcate the start and end of a sequence.
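If you have several sequences on one card, the dark frames can also be used to split them up automatically. Here is a minimal sketch, assuming you already have a mean brightness figure (0 to 255) for each frame from whatever tool you prefer; the function name and the threshold are made up for illustration.

```python
def split_sequences(frames, dark_threshold=5.0):
    """Group frames into sequences, using dark frames as separators.

    frames: list of (filename, mean_brightness) pairs in shooting order,
            with mean_brightness on a 0-255 scale.
    dark_threshold: HYPOTHETICAL cut-off below which a frame is treated
            as a lens-cap shot; tune it for your own camera.
    """
    sequences, current = [], []
    for name, brightness in frames:
        if brightness < dark_threshold:
            # A dark frame closes the sequence collected so far.
            if current:
                sequences.append(current)
            current = []
        else:
            current.append(name)
    # Keep a trailing sequence that was not followed by a dark frame.
    if current:
        sequences.append(current)
    return sequences

shots = [("IMG_001", 112.0), ("IMG_002", 110.5), ("IMG_003", 0.8),
         ("IMG_004", 109.9), ("IMG_005", 111.2), ("IMG_006", 0.7)]
print(split_sequences(shots))
```

The dark frames themselves are dropped, so each group contains only the shots that belong in a stack.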
Despite spending a small fortune on my camera, for this type of shooting I turn ALL of its automatic features OFF – because consistency is everything when creating a stack and I don't want the camera to suddenly decide to change something. So that's manual exposure, fixed set white balance, no auto-ISO, and (of course) manual focus. This means the camera settings will be identical for every shot. You might want to take a few pre-shots in automatic modes to determine what manual settings you want to use, but when it comes to shooting the sequence ensure everything is set to manual.
For dSLRs the shoot is best executed with the mirror locked-up out of place, to avoid repetitive vibrations from the mirror action. Live View mode is helpful as it reduces the amount the camera has to be touched, which reduces the likelihood of the camera being moved or knocked. Live View also typically locks the mirror up, which we definitely want!
Live View eats batteries though, and you do not want to have to change batteries halfway through a session when everything has been set-up and locked down into place (I can't get the battery out of my camera when it's mounted on my bulky tripod)! So ensure full charge before starting work. In fact have a spare battery on hand so you can put a freshly charged one in the camera each time you remove it from the tripod to upload images so far (i.e. between whole sequences).
I'm using a high quality macro lens costing circa £900 new, but the optical design has hardly changed over the years so a second-hand manual lens is quite adequate and can be sourced for circa £100. This lens has built-in vibration reduction, which is not wanted when shooting high magnification on a tripod, so that also is turned OFF.
I shoot at f/8 as that gives a workable degree of depth-of-field whilst avoiding the problems that very wide (f/4 or wider) and very small (f/16 or smaller) apertures can present. Note that as well as the problem of softness caused by diffraction, smaller apertures also show up sensor dirt very clearly, and with perhaps 50 shots in the stack cleaning all those images takes a lot of time.
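To put a rough number on the diffraction softening, the size of the diffraction blur (the Airy disk) can be estimated from the f-number. This is a sketch under textbook assumptions: green light at 0.55 micrometres, and the usual effective f-number of N × (1 + m) at magnification m, ignoring pupil magnification. Some cameras already display the effective aperture at close focus, in which case pass a magnification of 0.

```python
def airy_disk_diameter_um(f_number, magnification=0.0, wavelength_um=0.55):
    """Diameter of the Airy disk (to the first dark ring) in micrometres.

    Uses d = 2.44 * wavelength * N_eff, where the effective f-number is
    N_eff = N * (1 + m) at magnification m (pupil magnification ignored).
    wavelength_um = 0.55 (green light) is an ASSUMED representative value.
    """
    effective_f_number = f_number * (1 + magnification)
    return 2.44 * wavelength_um * effective_f_number

# Diffraction blur at life size (1x) across a range of marked apertures
for n in (4, 8, 11, 16, 22):
    print(f"f/{n} at 1x: about {airy_disk_diameter_um(n, 1.0):.0f} um")
```

The blur doubles with every two stops, so by f/16 the diffraction spot is twice the size it is at f/8. How much of that is visible depends on the sensor and the final print size, so treat the figures as a guide to the trend rather than a hard limit.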
Since I'm shooting still-life macro the shutter speed can be whatever is required, within reason. Long shutter speeds (1s or more) exacerbate noise. The examples here are all shot at around 1/100th of a second. Outside of the studio I will aim for something over 1/250th, in order to help arrest any movement caused by breezes. In that case I would likely use a higher ISO setting, rather than a wider aperture, to achieve a faster shutter speed – again within reason, and depending on the camera. I'm fairly comfortable all the way up to ISO1600 with my camera (although for stacking ISO100 is my definite preference).
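The trade between shutter speed and ISO is simple proportion, sketched below. It assumes the aperture and the lighting stay the same, and the function names are just for illustration.

```python
import math

def iso_for_shutter(base_iso, base_shutter_s, target_shutter_s):
    """ISO needed to keep the same exposure at a new shutter speed,
    with the aperture and lighting left unchanged."""
    return base_iso * base_shutter_s / target_shutter_s

def stops_between(slow_shutter_s, fast_shutter_s):
    """Exposure difference in stops between two shutter speeds."""
    return math.log2(slow_shutter_s / fast_shutter_s)

# Going from the studio setting (1/100s at ISO100) to 1/250s outdoors
print(round(iso_for_shutter(100, 1 / 100, 1 / 250)))
print(round(stops_between(1 / 100, 1 / 250), 2))
```

So moving from 1/100th to 1/250th costs a little over one stop, which is a modest rise in ISO and well within the range I'm comfortable with.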
For macro work of any kind (whether focus-stacking or not) minimising noise is critical, as any form of noise removal causes softening of the image. So that means low ISO, avoid long exposures, and expose to the right of the histogram (without burning out the highlights). Remember noise is most apparent in the dark tones, so the greater the exposure the cleaner will be the image.
If it is very bright I may stop down to f/11, but no further than that – a neutral density or a polarising filter is preferred to very small apertures.
To further avoid having to touch the camera, the shutter is triggered by a remote switch, which I picked up new for under a fiver. I wouldn't do any kind of stacking project without it. If you really can't get one for your camera then use the self-timer so there is a delay between pressing the shutter release and taking the shot. This allows any tremors arising from the force applied to the shutter release to settle down.
Because the subject is so very small I was able to light it with a pair of 6W domestic table lamps. As well as being extremely cheap these gave constant illumination from shot to shot (I had my blind down at the window so there was no fear of changing daylight affecting the exposure). Note that well diffused light is generally ideal for macro work. Strongly directional light casts harsh shadows that obscure the detail we are trying to reveal.
For the shoot itself I record RAW images. Normally I record RAW+JPG. The JPGs offer a kind of (last ditch) backup and are handy IF I want to share pictures quickly (which you know I might when away on holiday and such). BUT when taking 50+ shots for a single image, the JPGs just eat up time and storage space. Others elect to turn off RAW and shoot in JPG only, which I kind of get, but since I'm putting so much effort into creating a high quality image I just don't feel like I can give up the advantages of RAW files.
The only other practical issue was to be sure not to kick the base board (or the tripod) between shots when adjusting the focus rail.
Once the full sequence has been taken, and before the camera is moved, use the review function to step quickly through all of the images, almost as though they were playing as a video. You should see the subject gently drifting in the frame – each shot is taken from a different distance to the subject so the magnification and perspective changes throughout the stack which causes this drift. You should try to ensure there's enough space around the subject to allow for this drifting, in fact take pre-shots at both (near and far) focus extremes to check the framing and lighting is good throughout.
If you see the subject jumping about in the frame, in a jerky fashion, then the camera has been joggled during the shooting. Try to work out where the movement is coming from and how it can be controlled, then shoot the sequence again.
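The same check can be made numerically if you can get a subject position for each frame (for example by noting where one fixed feature sits in each shot). This is a minimal sketch of the idea, not something any stacking package does for you: a smooth drift gives steps of roughly equal size, while a knock shows up as one step far larger than the rest. The tolerance factor is an arbitrary assumption.

```python
from statistics import median

def find_jerks(positions, tolerance=3.0):
    """Return indices of frames where the subject jumped rather than drifted.

    positions: list of (x, y) subject positions in pixels, one per frame,
               in shooting order. How they are measured is up to you.
    tolerance: HYPOTHETICAL factor; a step longer than tolerance times the
               median step is flagged as a jump.
    """
    steps = [
        ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    ]
    if not steps:
        return []
    typical = max(median(steps), 1e-9)  # avoid a zero threshold
    # Step i runs from frame i to frame i + 1, so frame i + 1 is the one
    # that landed in the wrong place.
    return [i + 1 for i, step in enumerate(steps) if step > tolerance * typical]

# A gentle, even drift across eight frames
smooth = [(100 + 2 * i, 200 + i) for i in range(8)]
# The same drift, but the camera was knocked before the fifth frame
knocked = smooth[:4] + [(x + 15, y - 9) for x, y in smooth[4:]]

print(find_jerks(smooth))
print(find_jerks(knocked))
```

The first call reports nothing and the second flags the fifth frame (index 4), which is where I would go looking for the cause.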
I've also included here a full video walkthrough of a focus-stacking session.
A summary of what you will need
I'm assuming you have a camera with some kind of macro, or close-up, facility – but over and above that in order to focus-stack macro images you will need:
Manual exposure mode
An aperture in the range f/5.6 to f/11
A shutter speed probably faster than 1/100th of a second
A low ISO setting
A high exposure (expose to the right)
A constant White Balance setting
Vibration Reduction and auto-focus disabled
The mirror locked up (in a dSLR)
A Live View facility (that allows magnification for manual fine-scale focussing)
Possibly neutral density or polarising filters to help balance the exposure settings
A sturdy tripod (not over extended)
A focus rail.
A remote shutter release
And, if shooting in the wild, a windbreak of some kind!
Oh, and time, plenty of time. A studio shot of 50 or so frames can end up taking 2 hours with various retries. Shots in the wild need to be accomplished much faster as conditions change quite quickly, perhaps 20 minutes is enough but you need to know you have 20 minutes and your compatriots are not going to start getting bored waiting for you to finish.
A focus-stack in the wild
When shooting live insects in the wild I might elect to use the camera's intervalometer, setting it to take a shot every second and adjusting the focus rail between times. This means once everything is set-up a good stack can be captured in the order of 10s or so. This is ample for sedate insects, eg. at the very start or end of the day when they don't dash about eating and such quite so much.
Note though, I said sedate not sedated. Plenty of these astonishing extreme macro shots of flies and such are actually specimens that are either dead or else have been somewhat tortured by being artificially cooled in a refrigerator by the photographer. I don't know about you, but that's not the kind of photographer I want to be!
By the way, plenty of folk do focus-stacking completely hand-held, with no tripod or focus rail, and they seem very happy with the results. This is perfectly fine at the landscape (wide view) level, with 3 or so shots autofocused at different points in the scene. I'm less convinced it's a success oriented strategy at the macro level though.
Anyway, all of the foregoing should allow you to get a stack of images ready to be automagically transformed into a single, pin-sharp, image in post...
Phase 2: Creating the image
First off, it's perfectly possible to manually merge all of the frames into a single sharp image. Start with the rear most shot. Lay the next shot on top. Adjust the perspective so the subjects line up. Then erase all the bits of the upper layer that are out of focus. Repeat for each subsequent layer... Sounds straight forward, but it will take ages, days even. You'll probably want to work with no more than 3 to 5 images. That's ample for getting a landscape image sharp front to back. But for macro work, you will want to employ the software tools available.
With landscape based focus-stacks it is the proximity of the forward most object (that you want to appear sharp) that determines how many shots will be required – depth-of-field increases with distance. It may take 2 or 3 near field shots to cover the foreground, followed by perhaps one shot each for the mid and back grounds.
But, assuming we want to stack lots of super-small depth-of-field shots of a macro subject the workflow looks like:
Develop all of the RAW shots exactly equally
Create the stack
Manually adjust to manage 'stacking artefacts' (which will be apparent)
Step 1: Development
Developing all of the RAW files exactly equally is fairly straight forward. They can all be loaded in to the RAW development software (I use Adobe™ Camera Raw) together since the computer doesn't actually need to manipulate the full scale bitmap of every single frame. RAW Development is just a bunch of instructions that get applied (later) when the image is opened in photo-editing software. The RAW processor only needs to apply the instructions to the single frame that is being previewed. All of the other frames that are being simultaneously developed are only shown in thumbnail, and updating thumbnails doesn't take a whole load of memory or processing power. So load all of the frames, set one as the big preview, select all frames and apply the RAW development settings. I can happily process over 100 images simultaneously because only the image being previewed needs to be fully loaded into memory. There's no reason to baulk at loading multiple images into RAW Development software – give it a go.
Usually for RAW development I will concentrate on (ie. select for preview) something from the middle of the stack. Once I'm happy with the development I will check how the first and last images are looking (and tweak the development if needs be) before committing the settings. Once the settings are committed the software simply writes the appropriate sidecar files for every frame, which takes no real time as they're just lists of instructions.
This is the underlying nature of RAW Development, and really makes a nonsense of any arguments for shooting stacks as JPGs.
If working with an image editor that is native to the RAW Development software (eg. Photoshop with Adobe Camera Raw) I can then move onto the next step. Otherwise I need to export the developed files to something the other software tools can work with; typically TIFFs.
Step 2: Stacking
I took a look at 6 different applications that provide focus-stacking, to be honest mostly because I was pretty disappointed with my default workflow (Photoshop!). I guess it's okay for wide-field (low magnification) focus-stacking of landscapes with just a few images – but working with 50 shots at life size magnification was a real trial.
Photoshop loads every image as an independent layer, merging them with very precise masks. This approach has two massive drawbacks. Firstly, you need to be able to load every single layer. That was over 2GB of image data. With 128GB of RAM and 16 processor cores on 2 Intel Xeons, that was actually a breeze. But the vast majority of machines photographers are using just couldn't do this. Secondly, once the stack was complete and I wanted to make adjustments things got super tricksy. I had to find the layer that was 'soft' and then remove the parts of the mask that were letting the softness through. This left an empty gap in the image! So then I had to find a layer that had the same area, but sharper, and paint that layer's mask in... which is basically double the edits.
So I took a look at what Affinity Photo could do. I like Affinity Photo. I bought it for the day when I can no longer stomach Adobe's licensing model (that day sure is coming), so this was a good chance to put it through its paces. Affinity takes each image in turn and extracts what is needed into a composite image window – you can see the image building up as it goes! This avoids needing to load a full 2GB of image data all at once BUT you end up with a single, flat, composite image layer. I couldn't see any opportunity to influence exactly which parts of which frames would contribute to the result. You get what you get. You can of course dodge and burn that layer, or use any other tools to try and correct issues, but without reference to all of the original frames the quality of the result is only going to go down from there.
These monolithic purpose-built image editing applications really only offer a very basic starting point for anyone getting into focus-stacking at the macro level (fine for landscapes though). So I took a look at alternatives that were geared specifically towards the task of focus-stacking. They'd be no-good for finalising the image in terms of signature style and atmospherics, but can perhaps create a great, intermediate result.
Helicon and Zerene seem to be the most fêted of the current offerings and I have to say both immediately outshone 'the monoliths'. Both seem to work on a file-by-file basis, avoiding the performance issues of Photoshop (ie. will work on much lower powered machines). The difference to Affinity is that these applications have a user interface designed for the specific job at hand! They allow the stacked result to be viewed side-by-side with any of the individual frames that have been used, and thereby allow adjustments to the contribution that the frame is making to the result. All of a sudden the control of the resulting image sits where it belongs, in the photographer's hands, not just some algorithm. Technology as aid, not creator. I like these. They both cost though, with pricing ranging from ~$30 to ~$200, depending on level of functionality (and I need the Pro version, of course) and whether you want to pay them every year or just the once (at a premium cost!).
To my mind though, Zerene is worth £150 for its Prosumer edition. Plus it's an investment with a person (a photographer in fact) rather than a corporation.
But I also wanted to see if it was possible to achieve advanced focus-stacked images without incurring yet another software cost. CombineZP gets lots of mentions, but when I downloaded and installed it on my Windows 10 system it just wouldn't run immediately. I thought that even if I got it running it isn't great for other users if just getting started is a major chew-on. It's also reported as being a bit long in the tooth, so I moved on. Next I tried ImageJ. It installed no problem but with an utterly confusing interface. However there are really excellent help forums (if a little pedantic) from which I discovered the focus-stack plug-in is also required. Again that loaded no problem and I was able to make a stack. The results were somewhere between Affinity and Helicon. Not really good enough for me, but if you don't want to spend anything I think it's a fair contender. A couple of others I didn't look at, but which may be worth trying out are Hugin Tools and Picolay.
So let's look a little closer at some of the results:
Comparative 'out-of-the-box' stacking results
For the above I used the same set of RAW Developed TIFF files in each application and ran the 'out-of-the-box' stacking command – ie. without messing around with any of the fine-tuning controls. That said, Photoshop and Affinity don't offer any fine-tuning controls! Which means it's quite possible that Helicon and Zerene could do a better job with more practice. However, even with this simple straight out-of-the-box exercise Helicon and Zerene both deliver significantly superior results with less of an apparent halo around the frontmost knight. They also create much better definition around the lower jaw of the frontmost knight. Plus they have much better interfaces for patching such issues up after the automated algorithm has done its work.
So with focus-stacking it's worth introducing an extra software application to the workflow to achieve the stack before reverting to the more general photo-editing applications. I'll be using Zerene going forward but I expect Helicon and ImageJ (for a totally free alternative) to be quite workable also.
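For anyone curious what the stacking step is doing in principle, here is a deliberately naive sketch: for every pixel, take the value from whichever frame is sharpest at that spot. I am not suggesting this is how Photoshop, Affinity, Helicon or Zerene work internally (I don't know their methods, and they clearly do something far more refined); it simply shows the basic idea on tiny greyscale grids, using a Laplacian as the assumed measure of sharpness and assuming the frames are already aligned.

```python
def sharpness(img, r, c):
    """Absolute Laplacian at (r, c): large where local detail is crisp."""
    h, w = len(img), len(img[0])

    def px(rr, cc):
        # Clamp coordinates so edge pixels reuse their nearest neighbour.
        return img[min(max(rr, 0), h - 1)][min(max(cc, 0), w - 1)]

    return abs(4 * px(r, c) - px(r - 1, c) - px(r + 1, c)
               - px(r, c - 1) - px(r, c + 1))

def naive_stack(frames):
    """For every pixel, copy the value from the frame that is sharpest there.

    frames: list of equally sized greyscale images (lists of rows),
            ASSUMED to be aligned already.
    Returns (composite, index_map); index_map records which frame won.
    """
    h, w = len(frames[0]), len(frames[0][0])
    composite = [[0] * w for _ in range(h)]
    index_map = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            best = max(range(len(frames)),
                       key=lambda i: sharpness(frames[i], r, c))
            composite[r][c] = frames[best][r][c]
            index_map[r][c] = best
    return composite, index_map

def checker(r, c):
    """Fine checkerboard standing in for sharp detail."""
    return 255 if (r + c) % 2 == 0 else 0

# Frame 0: detail on the left, featureless grey (blur) on the right.
near = [[checker(r, c) if c < 3 else 128 for c in range(6)] for r in range(4)]
# Frame 1: the opposite.
far = [[128 if c < 3 else checker(r, c) for c in range(6)] for r in range(4)]

composite, index_map = naive_stack([near, far])
print(index_map[0])
```

The index map is the interesting part: it is the equivalent of the layer masks, showing which frame each pixel was taken from. A per-pixel choice like this has no notion of foreground or background, which is one way to see why automatic stacks need manual correction afterwards.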
But you might wonder “Why's it so difficult? Where do these haloes come from?”
The Focus Shadow
Any and every focus-stack will deliver a degree of halo around sharp forward subjects that sit in front of equally sharp backward subjects.
By splitting the stack apart we can clearly see where the haloes come from.
Showing haloes caused in this example by forward layers including too much
In this example we can see that the forward layers have included blurry background detail from beyond the sharp edge of the muzzle.
In creating the stack we start with a series of shots that capture the foreground subject in sharp detail. Each of these has a blurry background.
By the time we have finished the sequence we also have a series of shots that capture the background subject in sharp focus. Each of those will have a blurry foreground (of the frontmost knight). The part of the background subject that is obscured by the blur of the foreground subject can never be captured in perfect sharp detail - because the camera cannot see perfectly through the blurred foreground.
When we pull the stack together most of the blurred foreground subject we see in the sharp background shots gets overlaid with the sharp foreground shots. Most of it, but not all of it. The sharp foreground image is smaller than its blurry counterpart. That's how focus works. Out of focus things are more spread-out than in focus things.
A single frame from towards the back of the stack
So there is a region around the sharp foreground subject that either contains the blurry background we had when taking the foreground shots OR contains the blurry foreground we had when taking the background shots. It's the physics of optics.
In this case the algorithm has decided that the blurry background captured with a sharp image of the near knight is sharper than the blurry foreground captured with a sharp image of the far knight... thus we see the halo around the muzzle in the forward layers of the stack.
In this case taking more detail from the back than from the front improves the image, and indeed that is what Helicon and Zerene achieved; although some further tweaks (burn, clone) were needed in the image editing software to create the final image, because none of the shots in this region are perfect.
There are a number of things that can be done to help alleviate this inherent difficulty with deep focus-stacks:
Keep the background simple, and far enough away such that it is equally blurry (more or less) whether shooting towards the front or the rear of the stack.
Choose a composition (as far as possible, without compromising the story the image tells) that minimises such overlaps of separated subject planes.
Minimise the depth of separation between subject planes that overlap (which will reduce the size of the halo effect)
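The last point can be given a rough scale. Here is a first-order, thin-lens estimate of how large the blur disc of an out-of-focus point becomes on the sensor as its distance from the plane of focus grows. It is a sketch under idealised assumptions (small separations relative to the working distance, no pupil magnification), and apart from f/8, 1x and the 24mm total depth the separations are example values.

```python
def defocus_blur_on_sensor_mm(separation_mm, f_number, magnification):
    """First-order thin-lens estimate of the blur-disc diameter on the
    sensor for a point separation_mm away from the plane of focus:

        b = m^2 * separation / (N * (1 + m))

    N is the marked f-number and m the magnification. This is an
    APPROXIMATION, not a measurement of any particular lens.
    """
    return (magnification ** 2 * separation_mm
            / (f_number * (1 + magnification)))

# Blur of one subject plane while the other is in focus, at f/8 and 1x
for separation in (2, 6, 12, 24):
    blur = defocus_blur_on_sensor_mm(separation, 8, 1.0)
    print(f"{separation} mm apart: blur disc about {blur:.2f} mm on the sensor")
```

The blur grows in direct proportion to the separation, and the blurred outline spills roughly half a blur-disc beyond the sharp edge, so halving the gap between overlapping planes should roughly halve the width of the halo.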
Other Stuff To Watch For
The earlier videos demonstrated what smooth and jerky focus-stacks look like, indicating that bad (jerky) stacks were caused by fine movements in the set-up during the shoot. This is true when shooting at a fixed focus setting (by moving the camera). If the stack is made by keeping the camera stationary and changing the focus setting of the lens there can be another cause of 'jerkiness' in the stack: focus breathing.
Put simply, focussing a lens changes its effective focal length (to an extent), which changes its angle of view and so the framing of the scene from shot to shot. It's not an effect that is easily noticeable when shooting a wide scene (at a distance), but it can be quite dramatic at high magnification. Focus breathing won't be much of a problem to the best stacking software, but it just adds one more level of manipulations required – which is why I choose not to refocus the lens when shooting stacks.
Small dark lines in the final shot may occur if there is significant sensor dirt. All of the images are adjusted for perspective as part of the stacking process. This adjustment can shift a dirt spot that appears in every frame so that, across the stack, it ends up becoming not a spot but a line. Clean everything before commencing a stack. Avoid small apertures that exacerbate the appearance of sensor dirt. Clean the individual frames after shooting and before stacking.
Backgrounds sometimes show through foregrounds. Slices of the background may fall into the plane of sharp focus for a given shot such that it takes precedence over later sharp foreground shots. If a stack contains regions of equal foreground and background sharpness the algorithm may not make the right choice. There needs to be flexibility in the software to override the algorithmic blending of the frames.
Focus bands. If there are soft patches in the final image then most likely the steps between shots have been too great and no truly sharp frame exists for the given area. Take more shots with smaller steps between them.
There's a whole lot more to learn about focus-stacking I think. Not so much in the fundamentals, but rather the practical application; like everything in photography it's a doing thing. So I'm off to shoot more stacks now. If you want to read more on the subject (but really, just shoot some stacks) then I can well recommend extreme-macro.co.uk