2013-09-22

Virtualizing a physical Linux, without VMware Converter, over the network, and quickly.

Note: Some Linux knowledge is assumed in this guide. After all, if you want to virtualize a Linux machine, I guess you know Linux to some degree.

[I didn't write this with the intention of it being a complete howto/manual, but rather to summarize the steps I took in case I have to do it again.]


I know it's a bit of a long title, but considering everything I found out there, I thought it was necessary to stress it a bit.

This method does NOT require Ghost, KVM, qemu, Virtual Box, vCenter, ESXi, etc.

I will only use normal Linux tools, and VMware Workstation as the target platform to run my VM.

I won't install anything on the source OS. We want to keep it clean. So in order to read the disk I will use a live-cd distribution (could also be a live-usb distro).

Now boot the source machine from the live CD.

Now, to make an exact copy of the disk you just need the "dd" program.

With it you can read the first disk without even needing to mount it. With no "of=" argument, dd writes the data to standard output:

# dd if=/dev/sda

You can copy it to another drive attached via USB:
# dd if=/dev/sda of=/dev/sdb   (more info about this at the end [X])

or to a file on a mounted external drive:
# dd if=/dev/sda of=/media/mounted_external_drive/image_of_sda.img

But this would be VERY slow, as dd copies in blocks of only 512 bytes by default. You may want to add
bs=16M
to copy in chunks of 16 MB
(you can read some performance comparisons out there)
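
If you want to see the effect of bs for yourself without touching a real disk, here is a minimal sketch using a scratch file. It assumes GNU dd (the one on most live CDs). The file names and the 4M value are only for illustration, and the speeds you get will depend on your hardware.

```shell
# Scratch demo: nothing here touches a real disk.
tmpdir=$(mktemp -d)
src="$tmpdir/src.img"

# Create an 8 MiB test file to copy.
dd if=/dev/zero of="$src" bs=1M count=8 2>/dev/null

# Default block size (512 bytes): many small reads and writes.
dd if="$src" of="$tmpdir/small.img" 2>&1 | tail -n 1

# Larger block size: same data, far fewer reads and writes.
dd if="$src" of="$tmpdir/big.img" bs=4M 2>&1 | tail -n 1

# Both copies should be byte-identical to the source.
if cmp -s "$src" "$tmpdir/small.img" && cmp -s "$src" "$tmpdir/big.img"; then
    copies_match=yes
else
    copies_match=no
fi
echo "copies identical: $copies_match"

rm -rf "$tmpdir"
```

The last line dd prints for each run includes the throughput, so you can compare the two directly.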

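And because you are reading from a disk that may contain errors, this could come in handy:
conv=noerror,sync
(noerror keeps going after a read error, and sync pads the unreadable or short block with zeros so the rest of the data stays at the right offset)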
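
One side effect worth knowing: conv=sync pads every short block with zeros up to bs, and that includes the last block read. So with bs=16M, an image of a disk whose size is not a multiple of 16 MiB should come out slightly larger than the disk itself. The sketch below shows the padding on a scratch file (not a real disk) and one way to trim it. It assumes GNU coreutils; the file names and the tiny 512-byte block size are only for the demo.

```shell
tmpdir=$(mktemp -d)

# 1000-byte scratch file: deliberately not a multiple of the block size.
head -c 1000 /dev/urandom > "$tmpdir/src.img"

# conv=sync pads the short final block with zeros up to bs.
dd if="$tmpdir/src.img" of="$tmpdir/copy.img" bs=512 conv=noerror,sync 2>/dev/null

src_bytes=$(wc -c < "$tmpdir/src.img")
copy_bytes=$(wc -c < "$tmpdir/copy.img")
echo "source: $src_bytes bytes, copy: $copy_bytes bytes"

# Trim the copy back to the source size (truncate is part of GNU coreutils).
truncate -s "$src_bytes" "$tmpdir/copy.img"

if cmp -s "$tmpdir/src.img" "$tmpdir/copy.img"; then
    trimmed_matches=yes
else
    trimmed_matches=no
fi
echo "trimmed copy identical: $trimmed_matches"

rm -rf "$tmpdir"
```

For a real disk, the exact size to trim to could be read on the source machine with something like "blockdev --getsize64 /dev/sda".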

You can also transfer the content over the network, piping it to ssh or to nc.

ssh will encrypt the content, which slows down the process and may not be needed if you are on a network you trust. So I will pipe it to nc.

# dd if=/dev/sda conv=noerror,sync bs=16M | nc -q 3  192.168.1.88  19000

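-q 3 makes nc quit 3 seconds after dd has stopped sending data (that is, after end-of-file on its input)
192.168.1.88 is the destination IP, and 19000 is the port, which could be anything (better high and unusual)
Note that the receiver (described further down) has to be listening before you start the sender.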

If you want to see the progress, you can put "pv" in the middle
# dd if=/dev/sda conv=noerror,sync bs=16M | pv | nc -q 3  192.168.1.88  19000

But if you don't have it, I will give you an alternative later.
In any case, when the sender is done, dd will report the average speed of the transfer.


Now, on the receiver system there is some funky stuff.
My receiver is a Windows host running Workstation. In a VM I boot the same Linux live CD and give it a preallocated virtual disk a bit larger than the source physical drive. When this VM boots, I format that vdisk as NTFS and mount it. This vdisk will be a temporary container for the image of the source physical machine's disk (yes, a vdisk within a vdisk). I format it as NTFS because later we will mount this vmdk on Windows and extract the image.

So on the receiver linux live-cd I run:
# nc -l -p 19000 > /mnt/ntfs_drive/source_physical_drive.img

If you want to see the progress of the transfer you can put pv in the middle
# nc -l -p 19000 | pv  > /mnt/ntfs_drive/source_physical_drive.img

OR you can open another console and run
# watch -n 20 ls -lh /mnt/ntfs_drive/
Which will run "ls -lh /mnt/ntfs_drive/" every 20 seconds, and you will be able to see the img file growing fat.
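
If you know the size of the source disk in bytes (on the sender, "blockdev --getsize64 /dev/sda" should report it), you can turn the growing file size into a rough percentage. A small sketch: percent_done is a made-up helper name, and the numbers in the demo call are just the example disk size used further down in this post.

```shell
# percent_done CURRENT_BYTES TOTAL_BYTES
# Prints how much of the transfer is done, as a whole percentage.
percent_done() {
    echo $(( $1 * 100 / $2 ))
}

# Demo with fixed numbers: half of a 95066058752-byte disk received.
percent_done 47533029376 95066058752    # prints 50

# Real use on the receiver would look something like this (stat -c is GNU):
# percent_done "$(stat -c %s /mnt/ntfs_drive/source_physical_drive.img)" 95066058752
```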



When the transfer is completed, you can unmount /mnt/ntfs_drive/ and shut down the receiver Linux.

Now, from Workstation, map/mount the vmdk that the receiver Linux was writing into. Once it is mounted you will see the file source_physical_drive.img inside it.

Copy that file to some place inside the Windows filesystem. Then you can unmap/unmount that vmdk you just mounted.

Now, that file holds the content of the source disk. It is not yet in a format that Workstation can use, but almost.

We just need to recreate the vmdk descriptor for that image file.

From a Windows command prompt, get the size of the img file (run dir in the folder where the file is)

Let's say it is 95066058752   (bytes)

Now divide that by 1024, and you will have the size in KB,  92837948
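
The same arithmetic as a small shell sketch, in case you prefer not to do it by hand. The variable names are mine; the number is the example size from above. It also warns if the byte count is not a whole number of KB, because then the size you pass to the disk tool could not match the image exactly.

```shell
size_bytes=95066058752

# A size that is not a multiple of 1024 cannot be expressed exactly in KB.
if [ $(( size_bytes % 1024 )) -ne 0 ]; then
    echo "warning: $size_bytes is not a whole number of KB" >&2
fi

size_kb=$(( size_bytes / 1024 ))
echo "${size_kb}KB"    # prints 92837948KB
```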

Now create a virtual disk with the same size and the format you desire.

In my case I want:
-c                create a new virtual disk
-t 2              disk type 2: preallocated, single file
-a lsilogic       adapter type
-s 92837948KB     size in KB


So I run
vmware-vdiskmanager -c -t 2 -a lsilogic -s 92837948KB  myVM.vmdk

In the folder you will now have:
myVM.vmdk
myVM-flat.vmdk
source_physical_drive.img

Now you can remove myVM-flat.vmdk and rename source_physical_drive.img to myVM-flat.vmdk
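
The rename trick only works if the image is exactly as big as the flat file it replaces, so it is worth comparing the two sizes first. On Windows, dir shows both. If you have a POSIX shell at hand (Cygwin, or a Linux host), a check could look like this sketch. same_size is a hypothetical helper, and the demo uses scratch files rather than the real vmdk.

```shell
# same_size FILE_A FILE_B
# Succeeds only if both files have exactly the same number of bytes.
same_size() {
    [ "$(wc -c < "$1")" -eq "$(wc -c < "$2")" ]
}

# Scratch files standing in for the flat vmdk and two candidate images.
tmpdir=$(mktemp -d)
head -c 4096 /dev/zero    > "$tmpdir/flat.vmdk"
head -c 4096 /dev/urandom > "$tmpdir/image.img"
head -c 5000 /dev/urandom > "$tmpdir/padded.img"

if same_size "$tmpdir/flat.vmdk" "$tmpdir/image.img"; then
    equal_result=match
else
    equal_result=mismatch
fi

if same_size "$tmpdir/flat.vmdk" "$tmpdir/padded.img"; then
    padded_result=match
else
    padded_result=mismatch
fi

echo "image: $equal_result, padded: $padded_result"
rm -rf "$tmpdir"
```

With the real files, the idea would be to run the rename only when same_size myVM-flat.vmdk source_physical_drive.img succeeds.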

You now have the content of your source physical machine in a vdisk format that Workstation can use.

You just need to create a VM and attach this vmdk to it.

I am writing these lines from a Linux that has been running for maaaany years on a Dell Inspiron and now is running on Workstation on another Dell Inspiron some days old.



Notes:
- If you are doing this over wifi, DON'T. You will kill the network. Connect both machines with cable.
- The destination Linux could be running from iso, you don't really need a physical CD. Any knoppix will do the job if you don't know what to use.
- Backup the source system before you start
- Disclaimer: Use this info at your own risk.
- I would normally copy/paste all the commands from the real execution instead of writing them from memory, but due to the circumstances that is not possible. That said, I have done my best to avoid any mistakes. If you find any, please let me know and I will update this post.
- To improve speed/performance, connect the receiver Linux to the network in Workstation's bridged mode (without replicating network state).


If you want to test the whole procedure but quickly, do this:
- In workstation, install a tiny linux in a vm
- Boot that vm from the linux-live-cd
- Create a 2nd vm with a vdisk and boot it from live-cd
- Do the transfer between both vms and then test the file extraction and conversion. When the transfer is completed you should have the same tiny linux running on 2 vms.
I used puppy linux for this test.
This experiment should not impact the network as all the networking is inside workstation (if you are using NAT connection).

------------------------

[X] If you want to copy from disk to disk and then be able to reshape the size of the partitions (destination may be larger than source and you want to use it all), there is a handy tool that will do all the hard work for you.

It is the gparted live CD. http://gparted.sourceforge.net/index.php

You boot from it, and you can clone partitions from one disk to another (both connected to the computer).

3 comments:

  1. If the destination platform is ESXi instead of VMware Workstation, then a few steps are different.

    Once the destination vm has the image file,
    1. Open a browser to the vCenter Server webserver and log in
    2. Transfer the image file to a Datastore connected to an ESXi (this may take some time, another option (better) is to use scp / winscp)
    3. Log in via ssh (or console) into that ESXi and recreate the descriptor file for that image file.
    4. Join the descriptor with our image file

    Example of steps 3 and 4:
    # my image file is: p2v-linux.img
    # I create a descriptor file with
    vmkfstools -c 92837948KB -a lsilogic -d thin p2v-linux.vmdk
    # Change our img file name, overwriting the -flat.vmdk that we don't need
    mv p2v-linux.img p2v-linux-flat.vmdk
    # Now we should have 2 files, p2v-linux.vmdk and p2v-linux-flat.vmdk

    You can now attach that virtual disk to a new vm and power on.

  2. You know, the linux part is not needed, at least in the destination part.

    You could just use a windows native nc, like the one in Cygwin, or one precompiled specifically for windows, like ncat (see http://nmap.org/ncat/), which is an independent executable.
    Doing it this way also avoids the complexity/storage requirements of having the received .img inside a virtual disk, and having to get it out. It just lands directly in a Windows folder.

    Just my 2cents

  3. Thanks! I didn't know that.
    That certainly makes the final part easier.
