ZFS (Part 1)

Over the last year I was getting more and more curious/excited about OpenSolaris. Specifically I got interested in ZFS – Sun’s new filesystem/volume manager.

So I finally got my act together and gave it a whirl.

Test system: Pentium 4, 3.0 GHz, in an MSI P4N SLI motherboard. Three ATA Seagate ST3300831A hard drives and one Maxtor 6L300R0 ATA drive (all are nominally 300 gigs – see previous post on slight capacity differences). One Western Digital WDC WD800JD-60LU SATA 80 gig hard drive. Solaris Express Community Release (SXCR) build 51.

Originally I started this project running SXCR 41, but back then I only had three 300 gig drives, and that was interfering with my plans for RAID 5 greatness. In the end the wait was worth it, as ZFS has been revved since.

A bit about the MSI motherboard. I like it. For a PC system, I like it a lot. It has two PCI slots, two full-length PCIe slots (16x), and one PCIe 1x slot. Technically it supports SLI with two ATI CrossFire or Nvidia SLI capable cards, however in that case both full-length slots will run at 8x; a single card will run at 16x. Two dual-channel IDE connectors, four SATA connectors, built-in high end audio with SPDIF, built-in GigE NIC based on a Marvell chipset/PHY, serial, parallel, and built-in IEEE 1394 (iLink/FireWire) with 3 ports (one on the back of the board, two more can be brought out). Plenty of USB 2.0 connectors (4 on the back of the board, 6 more can be brought out from connector banks on the motherboard). Overall, pretty shiny.

My setup consists of four IDE hard drives on the IDE bus, and the 80 gig WD on the SATA bus for the OS. The motherboard BIOS allowed me to specify that I want to boot from the SATA drive first, so I took advantage of the offer.

Installation of SXCR was from an IDE DVD drive (a pair of hard drives was unplugged for the duration).
SXCR recognized pretty much everything in the system except the built-in Marvell GigE NIC. Shit happens; I tossed in a PCI 3Com 3c509C NIC that I had kicking around, and restarted. There was a bit of a hold-up with the SATA drive – Solaris didn't recognize it, and wanted the geometry (number of cylinders, heads and sectors) so that it could create an appropriate volume label. Luckily WD makes an identical drive in IDE configuration, for which it actually provides the cylinders/heads/sectors information, so I plugged those numbers in, and format and fdisk cheered up.

Other than that, a normal Solaris install. I did a console/text install just because I am a lot more familiar with them; however, the Radeon Sapphire X550 PCIE video card was recognized, and the system happily boots into OpenWindows/CDE if you want it to.

So I proceeded to create a ZFS pool.
First thing I wanted to check is how portable ZFS is. Specifically, Sun claims that it's endianness neutral (ie I can connect the same drives to a little endian PC or a big endian SPARC system, and as long as both run an OS that recognizes ZFS, things will work). I wondered how it deals with device numbers. Traditionally Solaris is very picky about device IDs, and changing things like controllers or SCSI IDs on a system can be tricky.

Here I wanted to know if I could just create, say, a “travelling ZFS pool”: an external enclosure with a few SATA drives and an internal PCI SATA controller card, so that if things went wrong in a particular system, I could always unplug the drives, move them to a different system, and things would work. So I wanted to find out if ZFS can deal with changes in device IDs.

To work reliably, ZFS wants to use whole drives. It then writes an EFI disk label on each drive, with a unique identifier. Note that certain PC motherboards choke on EFI disk labels and refuse to boot; luckily, most of the time this is fixable with a BIOS update.

root@dara:/[03:00 AM]# uname -a
SunOS dara.NotBSD.org 5.11 snv_51 i86pc i386 i86pc
root@dara:/[03:00 AM]# zpool create raid1 raidz c0d0 c0d1 c1d0 c1d1
root@dara:/[03:01 AM]# zpool status
  pool: raid1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        raid1       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0d0    ONLINE       0     0     0
            c0d1    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0

errors: No known data errors
root@dara:/[03:02 AM]# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
raid1                  1.09T    238K   1.09T     0%  ONLINE     -
root@dara:/[03:02 AM]# df -h /raid1 
Filesystem             size   used  avail capacity  Mounted on
raid1                  822G    37K   822G     1%    /raid1
root@dara:/[03:02 AM]# 

Here I created a raidz1 pool – the ZFS equivalent of RAID 5 with one parity disk, giving me (N-1) × [capacity of the smallest drive]. raidz1 can survive the death of one hard drive. A pool can also be created with the raidz2 keyword, the equivalent of RAID 6 with two parity disks; such a configuration can survive the death of two disks.
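
Had I had a fifth drive, a double-parity pool could have been created along the same lines (a sketch only – the pool name and device names here are illustrative):

zpool create raid2 raidz2 c0d0 c0d1 c1d0 c1d1 c2d0

Everything else – status, list, df – works the same way.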

Note the difference in capacity that zpool list and df report. zpool list shows the raw pool capacity, including the space that will be consumed by parity; df shows the more traditional usable disk space, after parity. Using df will likely cause less confusion in normal operation.
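
Running the numbers from the transcript (using the ~279 “formatted” gigs that each nominally-300-gig drive actually provides):

4 x ~279G       = ~1.09T raw     (what zpool list reports)
(4 - 1) x ~279G = ~838G usable   (df shows 822G; the remaining ~16G presumably goes to metadata and reservations)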

So far so good.

Then I proceeded to create a large file on the ZFS pool:

root@dara:/raid1[03:04 AM]# time mkfile 10g reely_beeg_file

real    2m8.943s
user    0m0.062s
sys     0m5.460s
root@dara:/raid1[03:06 AM]# ls -la /raid1/reely_beeg_file 
-rw------T   1 root     root     10737418240 Nov 10 03:06 /raid1/reely_beeg_file
root@dara:/raid1[03:06 AM]#

While this was running, I had zpool iostat -v raid1 10 going in a different window.

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
raid1        211M  1.09T      0    187      0  18.7M
  raidz1     211M  1.09T      0    187      0  18.7M
    c1d0        -      -      0    110      0  6.26M
    c1d1        -      -      0    110      0  6.27M
    c0d0        -      -      0    110      0  6.25M
    c0d1        -      -      0     94      0  6.23M
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
raid1       1014M  1.09T      0    601      0  59.5M
  raidz1    1014M  1.09T      0    601      0  59.5M
    c1d0        -      -      0    364      0  20.0M
    c1d1        -      -      0    363      0  20.0M
    c0d0        -      -      0    355      0  19.9M
    c0d1        -      -      0    301      0  19.9M
----------  -----  -----  -----  -----  -----  -----

[...]
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
raid1       8.78G  1.08T      0    778    363  91.1M
  raidz1    8.78G  1.08T      0    778    363  91.1M
    c1d0        -      -      0    412      0  30.4M
    c1d1        -      -      0    411  5.68K  30.4M
    c0d0        -      -      0    411  5.68K  30.4M
    c0d1        -      -      0    383  5.68K  30.4M
----------  -----  -----  -----  -----  -----  -----

10 gigabytes written over 128 seconds. About 80 megabytes a second on continuous writes. I think I can live with that.

Next I wanted to compute md5 digests of some files on /raid1, then export the pool, shut the system down, switch the IDE cables around, boot the system back up, reimport the pool, and re-run the md5 digests. This would simulate moving a disk pool to a different system, screwing up the disk ordering in the process.

root@dara:/[12:20 PM]# digest -a md5 /raid1/*
(/raid1/reely_beeg_file) = 2dd26c4d4799ebd29fa31e48d49e8e53
(/raid1/sunstudio11-ii-20060829-sol-x86.tar.gz) = e7585f12317f95caecf8cfcf93d71b3e
root@dara:/[12:23 PM]# zpool status
  pool: raid1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        raid1       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0d0    ONLINE       0     0     0
            c0d1    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0

errors: No known data errors
root@dara:/[12:23 PM]# zpool export raid1
root@dara:/[12:23 PM]# zpool status
no pools available
root@dara:/[12:23 PM]#

The system was shut down, the IDE cables were switched around, and the system was rebooted.

root@dara:/[02:09 PM]# zpool status
no pools available
root@dara:/[02:09 PM]# zpool import raid1
root@dara:/[02:11 PM]# zpool status
  pool: raid1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        raid1       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0
            c0d0    ONLINE       0     0     0
            c0d1    ONLINE       0     0     0

errors: No known data errors
root@dara:/[02:11 PM]# 

Notice that the order of the drives changed: it was c0d0 c0d1 c1d0 c1d1, and now it's c1d0 c1d1 c0d0 c0d1.

root@dara:/[02:22 PM]# digest -a md5 /raid1/*
(/raid1/reely_beeg_file) = 2dd26c4d4799ebd29fa31e48d49e8e53
(/raid1/sunstudio11-ii-20060829-sol-x86.tar.gz) = e7585f12317f95caecf8cfcf93d71b3e
root@dara:/[02:25 PM]#

Same digests.

Oh, and a very neat feature…. You want to know what was happening with your disk pools?

root@dara:/[02:12 PM]# zpool history raid1
History for 'raid1':
2006-11-10.03:01:56 zpool create raid1 raidz c0d0 c0d1 c1d0 c1d1
2006-11-10.12:19:47 zpool export raid1
2006-11-10.12:20:07 zpool import raid1
2006-11-10.12:39:49 zpool export raid1
2006-11-10.12:46:14 zpool import raid1
2006-11-10.14:09:54 zpool export raid1
2006-11-10.14:11:00 zpool import raid1

Yes, ZFS logs pool commands onto the zpool devices themselves. So even if you move the pool to a different system, the command history will still be with you.

Lastly, some versioning history for ZFS:

root@dara:/[02:19 PM]# zpool upgrade raid1 
This system is currently running ZFS version 3.

Pool 'raid1' is already formatted using the current version.
root@dara:/[02:19 PM]# zpool upgrade -v
This system is currently running ZFS version 3.

The following versions are supported:

VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z

For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.
root@dara:/[02:19 PM]# 

Power consumption and hard drives

Some numbers about power consumption of hard drives….

Maxtor DiamondMax 10 6L300R0, 7200 RPM, 300 gig (279.48GB formatted) ATA hard drive has the following power consumption: +5V 740 mA, +12V 1500 mA.

Seagate Barracuda ST3300831A, 7200 RPM, 300 gig (279.45GB formatted) ATA hard drive has the following power consumption: +5V 460 mA, +12V 560 mA.
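
Doing the math on those nominal label figures (ballpark numbers only – the real draw varies with seeking, idle and spin-up):

Maxtor:  (5V x 0.74A) + (12V x 1.5A)  = 3.7W + 18.0W = ~21.7W
Seagate: (5V x 0.46A) + (12V x 0.56A) = 2.3W + 6.7W  = ~9.0W

That's a difference of roughly 12.7 watts per drive, or about 50 watts across the four 300 gig drives in this box.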

Seagate's tech spec sheet claims that their 'cudas also take 2.8 amps of +12V to spin up. Maxtor doesn't have a useful spec sheet for their product.

Observations: Seagate has a 5 year warranty on their drives. Lower power consumption means lower heat dissipation, and thus a cooler system. Lower power consumption also means that you can get away with a smaller power supply (or more drives in a system), and thus reduce your electricity costs (more of an issue in a 24/7 environment) and air conditioning/cooling costs.

Conclusions: One should spec hard drives not only from the point of view of cost (WD is cheap, but in my experience dies like a butterfly under a cold spell), but also from the point of view of warranty and power consumption. Sadly, vendors do not provide power consumption information in their spec sheets, so the only way to find it is to go to a computer store, ask to look at an OEM drive, and read the numbers off the label.

Stoopid WordPress

Just a quick rant.

I [censored] hate WordPress' “I am smarter than the stoopid user” attitude, especially its attempts to close what it thinks is an HTML tag. #include <foo.h> in between <pre> and </pre> does NOT mean that it needs to change the code to #include <foo .h> (see the space?), and add </foo> at the end of the post for me. Aaaaarrggh!

Why the [censored] does software assume that the user is stupider than it is? Why isn't there a way to turn off input sanitization? Case in point with HTML: the <p> tag doesn't need a </p> at the end of the paragraph. Never did. And in my world never will. I just want a paragraph break, dammit!!! Same idea with the <br> tag. I just want a newline, not a </br> to close it off (and WTF is </br> anyway?)

Oh, and if I add &lt;s in the body, the next time around editing they get replaced by literal <s, which upon the next save end up tripping WordPress' input sanitization insanity. Why, why, oh why?

Aaaaarrgggh! /me bangs head on the wall

We obviously overengineered. Maybe it’s time to EMP every computer on earth and start all over again.

Mac OS X/mach: Identifying architecture and CPU type

Platform independent endianness check:

#include <stdio.h>

union foo
{
  char p[4];
  int k;
};

int main()
{
  int j;
  union foo bar;
  printf("$Id: endianness.c,v 1.1 2006/07/09 17:48:14 stany Exp stany $\nChecks endianness of your platform\n");
  printf("Bigendian platform (ie Mac OS X PPC) would return \"abcd\"\n");
  printf("Littleendian platform (ie Linux x86) would return \"dcba\"\n");
  printf("Your platform returned ");
  bar.k = 0x61626364;
  for(j=0; j<4 ; j++)
  {
    printf("%c",bar.p[j]);
  }

  printf("\n");
  return 0;
}
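
Compiling and running it is straightforward; on a little endian x86 box the four bytes of 0x61626364 ('a' through 'd') come back in reverse order (an illustrative run):

bash$ cc -o endianness endianness.c
bash$ ./endianness
$Id: endianness.c,v 1.1 2006/07/09 17:48:14 stany Exp stany $
Checks endianness of your platform
Bigendian platform (ie Mac OS X PPC) would return "abcd"
Littleendian platform (ie Linux x86) would return "dcba"
Your platform returned dcba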

Platform dependent tell me everything check:

/*
 * $Id: cpuid.c,v 1.2 2002/08/03 23:38:39 stany Exp stany $
 */

#include <mach-o/arch.h>
#include <stdio.h>

const char *byte_order_strings[]  = {
        "Unknown",
        "Little Endian",
        "Big Endian",
};

int main() {

  const NXArchInfo *p=NXGetLocalArchInfo();
  printf("$Id: cpuid.c,v 1.2 2002/08/03 23:38:39 stany Exp stany $\n");
  printf("Identifies Darwin CPU type\n");
  printf("Name: %s\n", p->name);
  printf("Description: %s\n", p->description);
  printf("ByteOrder: %s\n", byte_order_strings[p->byteorder]);
  printf("CPUtype: %d\n", p->cputype);
  printf("CPUSubtype: %d\n\n", p->cpusubtype);
  printf("For a scary explanation of what CPUSubtype and CPUtype stand for,\n"
         "look into /usr/include/mach/machine.h\n\n"
         "ppc750\t-\tG3\n"
         "ppc7400\t-\tslower G4\n"
         "ppc7450\t-\tfaster G4\n"
         "ppc970\t-\tG5\n");

  return 0;
}
Mac OS X: Getting things to run on platforms that are not supported

Purposefully oblique description, I know.

Basically there are two ways of not supporting a platform.

One way is to not support the architecture. If I compile something as ppc64, no one on a G3 or G4 CPU will be able to run it natively, nor will x86 folks be able to run it under Rosetta. I can try to be cute and compile something for the x86 arch, cutting off all the PPC folks. I can compile something optimized for the PPC7400 CPU (G4): G5 and G4 systems will run it, and G3s will not (this is exactly what Apple did with iMovie and iDVD in iLife ’06). Lastly, I can compile something in one of the “deprecated” formats, potentially for Classic, and cut off the x86 folks, and annoy all the PPC folks, who would now have to start Classic to run my creation. Oh, the choices.

The other way is to restrict things by the configuration, and check during runtime.

Procedure for checking that the architecture you are using is supported by the application.

Step 1) Check what format the executable is in:

bash$ cd Example_App.app/Contents/MacOS
bash$ file Example_App
Example_App: Mach-O fat file with 2 architectures
Example_App (for architecture ppc):  Mach-O executable ppc
Example_App (for architecture i386): Mach-O executable i386

or

bash$ cd Other_Example/Contents/MacOS
bash$ file Other_Example
Other_Example: header for PowerPC PEF executable

Step 2a) If the application is Mach-O, then you can use lipo to see if it's compiled as a generic or as a platform specific binary:

bash$ lipo -detailed_info Example_App
Fat header in: Example_App
fat_magic 0xcafebabe
nfat_arch 2
architecture ppc
    cputype CPU_TYPE_POWERPC
    cpusubtype CPU_SUBTYPE_POWERPC_ALL
    offset 4096
    size 23388
    align 2^12 (4096)
architecture i386
    cputype CPU_TYPE_I386
    cpusubtype CPU_SUBTYPE_I386_ALL
    offset 28672
    size 26976
    align 2^12 (4096)

If you see CPU_SUBTYPE_POWERPC_ALL, the application is compiled for all PowerPC platforms, from G3 to G5.

What you do not want to see on a G3 or G4 system is:

bash$ lipo -detailed_info Example_App
Fat header in: Example_App
fat_magic 0xcafebabe
nfat_arch 1
architecture ppc64
    cputype CPU_TYPE_POWERPC64
    cpusubtype CPU_SUBTYPE_POWERPC_ALL
    offset 28672
    size 8488
    align 2^12 (4096)

Then you need a 64 bit platform, which amounts to a G5 of various speeds.

It is possible that the application is in Mach-O format, but not in fat format.
otool -h -v will decode the Mach header and tell you what CPU is required:
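
The output looks something along these lines (an illustrative header from a PPC-only binary; the exact columns vary with the otool version):

bash$ otool -hv Example_App
Example_App:
Mach header
      magic cputype cpusubtype filetype ncmds sizeofcmds      flags
   MH_MAGIC     PPC        ALL  EXECUTE    10       1940   NOUNDEFS DYLDLINK TWOLEVEL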


Step 2b) If the application is PEF (Preferred Executable Format) or CFM (Code Fragment Manager), things might be harder. I've not yet encountered a CFM or PEF app that would not run on a PPC platform in one way or another, so this section needs further expansion.



In the case of a runtime check, it is most commonly the platform architecture that is checked.

Some Apple professional software has something like this in AppleSampleProApp.app/Contents/Info.plist:

        <key>AELMinimumOSVersion</key>
        <string>10.4.4</string>
        <key>AELMinimumProKitVersion</key>
        <string>576</string>
        <key>AELMinimumQuickTimeVersion</key>
        <string>7.0.4</string>
        <key>ALEPlatform_PPC</key>
        <dict>
                <key>AELRequiredCPUType</key>
                <string>G4</string>
        </dict>
        <key>CFBundleDevelopmentRegion</key>
        <string>English</string>

Getting rid of

        <key>ALEPlatform_PPC</key>
        <dict>
                <key>AELRequiredCPUType</key>
                <string>G4</string>
        </dict>

tends to get the app to run under a G3.

Lastly, if the application says something similar to “Platform POWERBOOK4,1 Unsupported”, maybe running strings on SampleApplication.app/Contents/MacOS/SampleApplication combined with grep -i powerbook can reveal something.

bash$ strings SampleApplication | grep POWER
POWERBOOK5
POWERBOOK6
-
 POWERBOOK6,3
POWERMAC7
POWERMAC9,1
POWERMAC3,6
           POWERMAC11,2
-
 POWERMAC11,1

So if you want to run this application on a 500 MHz iBook G3 for some reason (hi, dAVE), it might make sense to fire up a hex editor and change one of the “allowed” arches to match yours.

For example to this:

bash$ strings SampleApplication | grep POWER
POWERBOOK4
POWERBOOK6
-
 POWERBOOK6,3
POWERMAC7
POWERMAC9,1
POWERMAC3,6
           POWERMAC11,2
-
 POWERMAC11,1

But don’t mind me. I am just rambling.

Video: DVD Studio Pro – I love you!

Spent a good chunk of today fighting with DVD creation. My previous attempt used iMovie HD 6 and iDVD 6. Now, I have some issues with iDVD:

Suppose your projects are saved on an external drive, both iMovie and iDVD. You'd think that when you are rendering the final DVD, it would keep all the bits and pieces on the external drive, where the data is saved. Oh, no. iDVD is different. iDVD is smarter than the user, and will try to save the intermediate audio track (before the muxing) in /tmp if you have a PCM audio track (and all of my previous projects did, because I didn't know any better. Me Ogg. Ogg stoopid, remember?). Now, imagine that you are running out of disk space on your internal drive as it is, and then out of nowhere another 1.5 – 2 gigs of stuff show up in /tmp. Of course iDVD would die at this point.

iDVD 6 is also temperamental about DVD mastering, and was refusing to even think about creating a dual layer DVD (ie anything longer than 4.7 gigs in size) if I didn't have a dual layer burner plugged into the iBook. Fair enough; I gave it the drive.

Then, after running all night, it would die with a numerical error code (I've googled for it; no one had seen it before). I tried three times, as originally I thought that I might have exceeded the “TV safe” area on the menu with the DVD title (another famous way of getting iDVD to die), or somesuch. But no, it just wouldn't work.

So I got access to a machine with Final Cut Studio installed on it.

Oh, what a joy.

The software actually uses the location you tell it to use, without arbitrarily using what it should not. The software tells you what it thinks you should do, but lets you overrule it if you think you know better. Sane software.

I plugged in my external hard drive, and imported into DVD Studio Pro the DV streams which I had used in iMovie and iDVD without much success before. It happily dealt with them.

Quickly I created a timeline. I had everything pretty much pre-rendered, so it was as simple as setting a bunch of chapter breaks, creating a menu, and linking the buttons to actions (ie Play chapter 1).

The system I was using is an elderly 1.5 GHz G4, which was kind of skipping frames when dealing with large streams, and thus creating chapters was a bit of an exercise in patience. I opened the iMovie project and looked at the places where I had placed chapter breaks before. In DVD Studio I created a bunch of chapter breaks arbitrarily, and then adjusted the times so they would match more or less what I had in iMovie.

Worked as it was supposed to. Beautiful.

DVD Studio was telling me that my project would compress down to 5.1 gigs. At this point I thought that I should just do it, then run it through DVD2OneX or somesuch and shrink it down to 4.7 gigs, so I told it to go ahead. It happily rendered to the hard drive (it also asked me where I wanted to set the layer break on a dual layer disk, which was really nice too).

Eventually I realized that there is such a thing as Compressor, which can take a component of a multiplexed stream and convert it to a different format.

After taking two 12 gig DV streams and running them through Compressor, I converted the audio tracks on both streams from PCM audio to Dolby 2.0 AC3.

Once I imported the AC3 streams into the DVD Studio Pro project, deleted the PCM audio from the timeline, and added in the AC3 audio, the projected project size dropped from 5.1 gigs to 4.1 gigs, and the actual project size (once assets were rendered) dropped from 4.9 gigs to 3.6 gigs (I used crappy video as the DV source, from video tapes that were sitting in storage for god knows how long, so they compressed a fair bit).

So overall, I am really, really happy with DVD Studio, although I've not used even a tenth of its capabilities. It can create HD DVDs. It can embed web links in MPEG files. It can edit existing menus. Now I need to save up my shekels to buy it (899 CAD for a student license for Final Cut Studio).

Ecological Impact of Gypsy Moth (Lymantria dispar)

I had to do this paper as part of independent research for BIOL 1004 (Biology II) at Carleton this summer. As I’ve handed it in (two days ago), I am posting it here as well.

Stany, 20060622


Department of Biology

Introductory Biology II
Summer Term 2006

Ecological Impact of Gypsy Moth

(Lymantria dispar)



Date Due: 20060720

“This paper is the sole work of the undersigned, does not contain unattributed material from any source and complies with the Academic Regulations section 14.1-4 (Instructional Offences) of the Carleton University Calendar.” (Biology Department, 2006, p10).

Signed:

Превед Кроссафчег

Stanislav N. Vardomskiy
SN: 1006XXXXX

Introduction

In North America, the gypsy moth is a serious pest of agriculture and deciduous forests that causes significant economic and environmental damage.

The gypsy moth (Lymantria dispar) is an insect native to Asia and Europe with very few natural predators in North America (Chaplin III, 2000). Asian and European races of Lymantria dispar differ in size, flight characteristics and host preferences. The Asian gypsy moth is larger than its European counterpart and is known to prefer over 500 tree species. In addition, both genders of the Asian gypsy moth are strong fliers, compared to only the males of the European gypsy moth (Humble and Stewart, 1994).

Until recently, most attention to the gypsy moth in North America centered on the European gypsy moth; however, in 1991 a race of the Asian gypsy moth was discovered in Vancouver, BC and in the states of Washington, Oregon and Ohio (Humble and Stewart 1994; APHIS 2003).

European Gypsy Moth

In the late 1860s, Etienne Leopold Trouvelot, an amateur entomologist, imported a gypsy moth egg cluster from France in hopes of cross-breeding the disease-resistant gypsy moth with local varieties. He was culturing some of these eggs in the trees of his suburban Boston home when some of the larvae escaped and infested nearby trees – first on his street, and soon in the neighborhood of Boston (Leibhold, 2003).

Trouvelot realized the significance of the escaped larvae and notified local entomologists; however, for close to 20 years the problem was largely ignored (Leibhold, 2003). Gradually more and more trees in the vicinity became infested.

The first outbreak of the moth occurred on his street in 1882, just as he left the country, but at the time very little was done. The first attempt at containment and eradication of gypsy moth larvae was organized by the Massachusetts State Board of Agriculture in 1889. At the time, efforts consisted of manual removal of egg clusters, application of early insecticides, and burning of infested trees. A lot of money and effort was spent; however, the infestation continued to spread. Eradication efforts in Massachusetts were abandoned by 1900 (Leibhold, 2003).

In Canada, the European gypsy moth is well established in the provinces of Quebec and Ontario, and threatens parts of New Brunswick and Nova Scotia (Humble and Stewart, 1994).

Asian Gypsy Moth

The Asian race of the gypsy moth was accidentally introduced to Vancouver in 1991, when larvae that hatched on ships in the harbor were blown ashore by the wind. Male moths were trapped, and application of the insecticide Btk eradicated the problem. Egg masses are now increasingly detected on ships, and since 1991 infested ships have been banned from inshore areas during periods of egg hatch and larval development (Humble and Stewart, 1994).

The Asian gypsy moth is not established in Canada; however, egg masses were intercepted in shipments as early as 1911, and have been intercepted almost yearly since 1982 (Humble and Stewart, 1994). In the United States, individual infestations occurred in Washington and Oregon in 1991 and in North Carolina in 1997. In 2000, Asian gypsy moths were again discovered in Portland, OR. In all cases the infestations were eradicated through aggressive trapping and spraying (APHIS 2003).

Gypsy Moth Life Cycle

The life cycle of the gypsy moth consists of four stages: egg, larva, pupa and adult moth. Adult moths generally lay egg clusters on tree trunks and branches; however, any sheltered location can be used. Egg clusters are laid in August, and the embryos develop over the warm days of summer. In about a month the larvae are fully formed and ready to hatch; instead, however, the larvae shut down metabolic activities and go into diapause, becoming insensitive to cold. In the spring, as the temperature increases, the larvae inside the eggs become more and more active. In mid-May the larvae chew through the egg shells and emerge (Duvall, 2006).

Before commencing feeding, larvae spread through the forest by a behavior called ballooning. A larva climbs to the top of the tree on which it hatched, and proceeds to dangle in the air on a silk thread. At this point the larva is still very light, so when the wind catches it and breaks the thread, the larva is carried on the wind. The silk thread and long body hairs slow the larva's descent. Most larvae land within 100 meters of where they hatched (Duvall 2006); however, some travel as far as a kilometer from the hatch site (Sharov 1997).

Once a larva lands, it proceeds to feed. Depending on sex, a larva will feed for five to six weeks. Females feed longer, in order to collect the fat necessary for laying eggs. Approximately once a week the larva grows too big for its exoskeleton, and molts. Molts separate the larval period into stages called instars. In the first three instars larvae feed during the day; however, by the fourth instar they start to feed at night and hide during the day in order to avoid predators (Duvall 2006). Approximately 90% of total leaf mass is consumed by larvae in the last two instars (Herms and Shetlar 2000).

In five or six weeks, a larva grows to a size of 4 to 6 cm. By mid-June to early July, the larva reaches maturity and starts looking for a safe place to pupate. Once a safe spot is found, the larva sheds its skin, and its new skin hardens into a brown shell. In the process, larvae can hide on vehicles and spread further during pupation. The pupa is immobile during most of this stage, as its body is transformed into that of a winged insect. After a one to two week pupation, the adult moth breaks free of the pupal shell and emerges (Duvall, 2006).

Adult gypsy moth females are about 4 cm long, and are white with black stripes on their forewings. Females of the European race cannot fly, and will fall to the ground if disturbed, while Asian race females will fly away. Male gypsy moths are smaller than females, have large feathery antennae, and are mottled grey and brown in color, giving them a similarity to native moth species. Male gypsy moths search for females in the late afternoon, which makes it possible to distinguish them from native species that search for mates at night (Duvall, 2006).

In the adult stage gypsy moths cannot feed, and have about 2 weeks in which to mate. Females release pheromones that assist males in finding them. A male searches for a pheromone trace and flies upwind until he finds a suitable female. Once a male and a female find each other and mate, the female lays all her eggs in a single tear-drop-shaped cluster and camouflages them with her own yellowish hair. Depending on how well she fed in the last two instars as a larva, a female can lay between 50 and 1000 eggs (Duvall, 2006).

Impact

In the larval stage of the life cycle, the gypsy moth consumes tree foliage. The European race is known to favor approximately 300 plant species, while the Asian race is known to consume the foliage of approximately 500 plant species (Humble and Stewart, 1994). During the first three instars, gypsy moths prefer the foliage of a limited selection of trees (apple, aspen, birch, larch, oak, willow, alder, hazel, etc.); however, once a larva gets to approximately 2 cm in size (third instar), it starts to consume the foliage of many more trees, such as spruce, pine, chestnut and hemlock (Ravlin and Stein 2001).

As the majority of foliage is consumed by larvae in the last two instars, a very wide variety of trees can be affected.

Ravlin and Stein did work on tree classification that makes it possible to statistically analyze forest composition and predict the defoliation effects of an infestation. Generally, forests that have a high composition of ash, balsam and Fraser fir, juniper, maple, mulberry, red cedar or sycamore are significantly less affected than forests that primarily consist of oak and birch (Ravlin and Stein, 2001).

Approximately once every 5 to 10 years a very severe infestation, termed an outbreak, occurs. In the case of the gypsy moth, early theories postulated that in low density infestations small mammal predators, such as deer mice, regulate the population, keeping equilibrium. At some point the natural population of predators drops because of a random failure in some other food source, and the moth population rapidly jumps to a higher equilibrium level. As the density of the moth population increases, various pathogens rapidly infect the population, causing the collapse of the outbreak. Current theories suggest that this is only part of the story, and involve the induced-defence hypothesis, which postulates that a decrease in available foliage causes a decrease in the moth population (Stone 2004) – in other words, the moths consume all available food and starve.

Furthermore, Jones demonstrated that while in the northeastern United States a large population of white-footed mice controls outbreaks of the gypsy moth, white-footed mice also spread Lyme disease, whereas a small population of the mice decreases the incidence of Lyme disease but allows the gypsy moth to breed (Jones 1998). Relationships such as these make theoretical explanations of outbreaks extremely complicated.

Depending on the severity of an infestation, up to 100% of a tree's foliage can be destroyed. Normally a healthy tree will survive such an event and generate a second flush of foliage by the end of July; however, any strained tree will be further stressed. In turn, stressed trees are more susceptible to fungi and diseases, and do not grow as much as unaffected trees.

Establishment of the gypsy moth in any new habitat can cause economic damage. Any lumber, tree nursery products or natural products leaving an affected area could have trading restrictions applied to them. Affected forests grow slower, with a higher incidence of tree death. As larvae eat the leaves of fruit trees, blueberries, strawberries and other food crops, the gypsy moth has the potential to severely affect agriculture (BCgov 2006). The Asian race of the gypsy moth is less picky about its food, and consumes coniferous trees, such as larch (Humble and Stewart, 1994).

During outbreaks, gypsy moth caterpillars are considered a nuisance in residential areas of eastern North America. In urban environments, larvae can congregate on buildings, driveways and sidewalks as they search for food. Caterpillar hairs shed by larvae are allergens that pose hazards to human health (BCgov 2006).

Containment and Control

The gypsy moth is an exotic invasive species in North America, and does not have as many natural controls there as it does in Europe or Asia. In North America, natural predators of the gypsy moth include birds, insects, and small mammals (Herms & Shetlar, 2000), with the most important being shrews (Sorex spp.), deer mice (Peromyscus maniculatus) (Leibhold 2003b) and white-footed mice (Peromyscus leucopus) (Jones 1998). As most small mammals are generalists, there is no strong correlation between the abundance of moths and the abundance of small mammals (Leibhold 2003b).

The presence of hairs on larvae makes that moth life stage unattractive to most birds, but a few species, such as the yellow-billed cuckoo (Coccyzus americanus) (MSU 1997) and the black-billed cuckoo (Coccyzus erythropthalmus), seem to enjoy eating larvae. Overall, in North America birds do not significantly contribute to the decline of gypsy moth populations (Leibhold 2003b).

It is established that the gypsy moth in North America cannot be eradicated (Leibhold, 2003), so current efforts are concentrated on reduction of damage and on prevention of infestation (Diss 1998).

Damage reduction consists of silvicultural control (changes in tree planting and harvesting) to make forests less habitable for the moth and minimize the damage; biological control to slow population growth and control outbreaks; killing of caterpillars; and removal of egg masses (Diss 1998).

Prevention consists of inspection and quarantine of vehicles that might transport larvae (Humble and Stewart 1994), combined with monitoring for new infestations.

The mating pheromone of the gypsy moth, disparlure ((7R,8S)-7,8-epoxy-2-methyloctadecane; cis-7,8-epoxy-2-methyloctadecane), was synthesized in the 1970s, and since then many attempts have been made to manage low-level infestations by disrupting mating habits. Disparlure was found to be effective only in low density infestations (Sharov et al. 2002), or as trap bait to check for the presence of males (Humble and Stewart, 1994).

Over 20 species of insect predators and parasites have been released into the wild in order to control the gypsy moth population (Leibhold 2003a), with various degrees of success.

The natural bacterium Bacillus thuringiensis var. kurstaki is the base of a commercially available insecticide, Btk, that is commonly used against gypsy moth infestations (Humble and Stewart, 1994). Unfortunately, Btk is extremely sensitive to timing, and is only effective for a few days after being spread. In that time slot it must be consumed by feeding larvae in order to be effective (KC 2006). Statistics gathered by the Washington State Department of Agriculture indicate that Btk-based insecticides are fallible, and possibly produce results no better than disparlure (WSDA 2005).

The gypsy moth is most susceptible to the nucleopolyhedrosis virus (NPV), more commonly known as the “wilt”. Infection happens once a larva consumes foliage that is contaminated with viral bodies. Once inside the larva, NPV invades through the gut wall and rapidly reproduces in internal tissues, disintegrating internal organs and eventually causing rupture. Once the host ruptures, viral occlusion bodies spread and infect other individuals (Leibhold 2003c).

NPV particles persist in the soil and in low density gypsy moth populations; however, with fewer hosts to infect, NPV causes little mortality. During moth outbreaks, NPV propagates rapidly and inflicts heavy casualties on the larval population. NPV is the most common cause of the collapse of outbreaks.

Research is being performed on developing NPV into a biological pesticide. Currently, limited quantities of this material, referred to as “Gypchek”, are available for control of outbreaks; however, it is costly to produce, as the manufacturing process currently requires moth larvae (Leibhold 2003c).

While total eradication of the gypsy moth in North America is currently not possible, containment measures consisting of infestation prevention and damage reduction are slowing down the gypsy moth's proliferation (Diss 1998). Leibhold indicates that only about 25% of the potential habitat of the gypsy moth has in fact been infested so far (Leibhold 1992, Leibhold 2003).

References

APHIS 2003, Asian Gypsy Moth, United States Department of Agriculture, Animal and Plant Health Inspection Service http://www.aphis.usda.gov/lpa/pubs/fsheet_faq_notice/fs_phasiangm.html Accessed 20060620

Biology Department. 2006. Introductory Biology II BIOL 1004 Summer Term Laboratory Manual, Carleton University Press, Ottawa, Ontario

BCgov 2006 Gypsy Moth Government of British Columbia http://www.agf.gov.bc.ca/cropprot/gypsymoth.htm Accessed 20060617

Chaplin III, F. Stuart, Zavaleta, Erica S., Eviner, T. Valerie, et al. 2000. Consequences of Changing Biodiversity, Nature, vol 405, p234-242

Diss, Andrea, 1998. Containing Gypsy Moth, Wisconsin Natural Resources Magazine,
http://www.wnrmag.com/stories/1998/aug98/gypsy.htm Accessed 20060619

Duvall, Matt. 2006 Gypsy Moth in Wisconsin – Lifecycle and Biology Wisconsin Department of Natural Resources http://www.uwex.edu/ces/gypsymoth/lifecycle.cfm Accessed 20060614

Herms, Daniel A., Shetlar, David J. 2000 Accessing Options for Managing Gypsy Moth Ohio State University, Columbus, Ohio

Humble, L., Stewart, A.J. 1994 Forest Pest Leaflet: Gypsy Moth Canadian Forest Service, Natural Resources Canada, Burnaby, BC. http://www.pfc.cfs.nrcan.gc.ca/cgi-bin/bstore/catalog_e.pl?catalog=3456 Electronic version accessed on 20060619

Jones, C. G., Ostfeld, R. S., Richard, M. P., Schauber, E. M. & Wolff, J. O. 1998. Chain reactions linking acorns to gypsy moth outbreaks and Lyme disease risk. Science vol 279, p1023–1026

KC 2006 Pest Control Public Health, Seattle and King County http://www.metrokc.gov/health/env_hlth/gypsy.htm Accessed 20060613

Liebhold A.M., Halverson J.A. & Elmes G.A. 1992. Gypsy moth invasion in North
America: a quantitative analysis. J. Biogeog., 19, p513-520. Electronic Version
http://www.jstor.org/view/03050270/dm995533/99p0135v/0?currentResult=03050270%2bdm995533%2b99p0135v%2b0%2cEF01&searchUrl=http%3A%2F%2Fwww.jstor.org%2Fsearch%2FAdvancedResults%3Fhp%3D25%26si%3D1%26All%3DGypsy%2Bmoth%26Exact%3D%26One%3D%26None%3D%26sd%3D%26ed%3D%26jt%3D%26ic%3D03050270%26ic%3D03050270%26node.Biological+Sciences%3D1%26node.Ecology%3D1 Accessed 20060615

Leibhold, Sandy. 2003 E. Leopold Trouvelot, Perpetrator of our Problem USDA Forest Service http://www.fs.fed.us/ne/morgantown/4557/gmoth/trouvelot/ Accessed 20060617

Leibhold, Sandy 2003a Gypsy Moth in North America USDA Forest Service
http://www.fs.fed.us/ne/morgantown/4557/gmoth/ Accessed 20060620

Leibhold, Sandy 2003b Gypsy Moth Natural Enemies – Vertebrates USDA Forest Service http://www.fs.fed.us/ne/morgantown/4557/gmoth/natenem/mammals.html Accessed 20060619

Leibhold, Sandy 2003c Gypsy Moth Nucleopolyhedrosis Virus USDA Forest Service http://www.fs.fed.us/ne/morgantown/4557/gmoth/natenem/virus.html Accessed 20060619

MSU 1997 Natural Enemies of Gypsy Moth Michigan State University http://www.ent.msu.edu/gypsyed/docs/enemies.html Accessed 20060918

Ravlin, William F., Stein, Kenneth J. 2001 Feeding preferences of gypsy moth caterpillars Virginia Tech http://gypsymoth.ento.vt.edu/vagm/Feeding_prefs_Mason.html Accessed 20060619

Sharov, Alexei. 1997 Model of Slowing Gypsy Moth Spread Department of Entomology, Virginia Tech. http://www.gypsymoth.ento.vt.edu/~sharov/sts/barrier.html Accessed 20060619

Sharov Alexei A, Leonard D, Liebhold A M, Clemens NS. 2002. Evaluation of preventive treatments in low-density gypsy moth populations using pheromone traps. J. Econ Entomol. 2002 Dec 95(6) p1205-15.

Stone, Lewi. 2004. A Three-Player Solution, Nature, vol 430, p299-300

WSDA 2006 Gypsy Moth Facts – January 2006 Washington State Department of Agriculture http://agr.wa.gov/PlantsInsects/InsectPests/GypsyMoth/FactSheet/docs/FactSheet2006.pdf Accessed 20060620

WSDA 2005 Gypsy Moth Report – Summary Report 2005 Washington State Department of Agriculture
http://agr.wa.gov/PlantsInsects/InsectPests/GypsyMoth/SummaryReports/docs/2005GMSummaryReport.pdf Accessed 20060619

Video: Video archives

I digitize various video footage, some of which is obtained from degrading video tapes, and needs to be color corrected.

The oldest footage so far was from a 1982 black-and-white Beta tape that was converted to VHS in 1988 (with timecode added), yet has been improperly stored since.

I keep the scenes that I feel are important, and eventually create DVDs in iMovie/iDVD; however, I am not 100% sure that the bits I discard are not important.

Thus I am interested in preserving the full footage as well, for at least 5 years, though realistically for over 10.

I can keep it on video tapes, yet I am not certain that's a good idea, considering how much some of this footage has degraded already. I am also concerned about the availability of VCRs capable of reading video tapes in a 10 year time frame.

One approach is to keep it in uncompressed DV format on a hard drive, and drop the hard drive into a bank vault. As I am dealing with ~300 hours of video altogether (at my rule of thumb of 12.5 gigs per hour of DV, that's roughly 3.7 terabytes), this is not realistic, especially if I want a backup. Besides, will I be able to read HFS+, ext3fs, NTFS or whatever? EIDE? SATA? FireWire? USB?

I've been considering compressing it at full NTSC (or PAL) resolution to DivX 6 format, and burning it to DVDs (or putting it on one or two external hard drives). Yes, I technically lose quality, but then again, most of my source material is not stellar as it is.

At this point I am concerned as well… Will I be able to decode it in 10 years? Will DVD drives still be available, or will they go the way of (5.25″) floppy drives?

Re-copying every 3 years?

Suggestions?

Is there anything better than DivX that I should look into? I am concerned about the disk space to quality ratio, and this project has no budget (ie I currently finance it out of my own pocket).

Video: Video digitization workflow

It so happened that I got involved in a video digitization project. Here is a quick description of my setup and workflow.

Key points

Video capture of DV video streams using QuickTime is not effective, as QT will try to re-encapsulate the DV stream, resulting in high CPU usage and dropped frames. Capture can be performed either using iMovie or using the digital VCR application in FireWire SDK 22. This normally results in less than 50% CPU usage on a 1.2 GHz iBook, and generates a proper DV stream.

In order to speed up exporting and importing data when working with iMovie, one can select “Show Package Contents” in Finder and look inside the iMovie project. Capture.iMovieProject/Media/ contains the video streams, which can be moved out and edited in stand-alone applications, or moved in to speed up the import procedure a great deal (it helps if the streams being imported are of the same format as the project, although in iMovie 6 anything that QT supported seemed to work – I just pay for it in terms of conversion time at the final export).
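
For instance (a hypothetical project – the volume and file names are illustrative, though iMovie does name captured clips Capture nn.dv):

bash$ cd /Volumes/External/Capture.iMovieProject
bash$ ls Media
Capture 01.dv   Capture 02.dv   Capture 03.dv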


Setup

Currently my tools are a Sony Hi8 video camera, essentially a consumer model, two different VCRs (A Mitsubishi and a Hitachi), and a Canopus ADVC-110 analog to digital adapter.

The Canopus box emulates a DV camera, and speaks a subset of the DV protocol, so as far as the host system is concerned, it is a somewhat dumb DV camera. Somewhat dumb because it silently ignores any of the DV commands that have to do with reading tape markers or rewinding the tape. Some of the commands tend to confuse it, making it switch from “analog to digital” to “digital to analog” conversion. But in 99 out of 100 cases it works really well.

Not all VCRs are born equal. My Hitachi generates a grainier image from the same footage; however, it is more resilient to video tape damage, and doesn't lose tracking as easily as the Mitsubishi does.

The usual way of performing digitization consists of plugging a video camera or VCR into the Canopus box, which in turn gets plugged over FireWire into an external hard drive, which in turn gets plugged over FireWire into an iBook that runs iMovie.

The iBook runs either iMovie HD 5 or 6 (depending on which iBook). The iMovie project is set to “DV” quality, and saved on the external drive (or wherever there is plenty of storage space). My rule of thumb is 12.5 gigs of disk space per 1 hour of DV video.

The difference between iMovie HD 5 and iMovie HD 6 that I've noticed is in resilience to interference. iMovie 6 seems better at dealing with damaged tapes that drop video frames: what iMovie HD 5 would show as a full frame dropout, iMovie 6 will only show as a horizontal black line across the image. There are no differences if the source material is of reasonable quality.

Most problems I've had were primarily caused by the length and quality of FireWire cables. Short, well shielded cables and only a couple of devices on the FireWire bus work best in my experience.

Workflow

In iMovie I enable the video camera capture mode (it should be automatic the moment iMovie detects a video camera plugged in), and hit play. iMovie sends a “play” command to the Canopus box, which the Canopus box happily ignores.

Then I manually cue up the video tape to about where I want to start capturing from, keeping in mind that capturing more is better than capturing less – I can always cue and splice things up in software.

Once I am happy, I click “import” in iMovie, and “press play on tape” (str) , and watch for a while that things are happy.

iMovie imports video in 1 hour chunks, so 12 gig files are about the largest size I deal with.

Once I am done digitizing, I quit iMovie and look inside the .iMovieProject/Media/ folder, which by now contains Capture nn.dv files. I move the .dv files out of the iMovie project and open them in QuickTime. If the video was filmed with a mono audio track, before further editing I adjust the audio to be center-center instead of left-right in the DV stream (Apple-J in QT Pro, followed by clicking on audio and adjusting the position of the audio channels).

I do rough cuts of the stream in QuickTime as well.

The main .dv file is opened, and the scenes of interest are copied and pasted into new QT windows, and trimmed as needed.

At this point my usual procedure is to create a folder and export the scenes that are worth preserving out of QT in DV format into this folder, using an xx0-filename.dv naming scheme. This naming convention is similar to numbering lines in BASIC, and allows me to roughly adjust the scene sequence before importing the scenes back into iMovie.
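
Something like this (hypothetical file names) – the gaps in the numbering leave room to splice a scene in between two others later without renaming everything:

010-arrival.dv
020-beach.dv
025-beach-sunset.dv
030-dinner.dv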

Once the captured streams are edited, I open iMovie again and create a new project, again in DV format. I save the empty project and again quit iMovie. In my experience iMovie takes forever to import things, as it both converts the stream it imports to the project format and copies it into the project's Media sub-folder, so it is a lot faster to just drop the .dv files into the Media subfolder of the iMovie project and re-start iMovie. On restart, iMovie will complain that there are new tracks in the project, and ask if I want to delete them or just move them to the trash. I choose to move them to the trash, and once iMovie starts, move them out of the trash into the timeline.

It takes me about an hour and a half to cut and export a moderately representative sample of what I capture, and I discard on average between 40 and 60% of the source material.