Category: Landmarks

Tickling the Panda

So after having narrowly missed out on starting my Master’s and arriving three and a half weeks late thanks to Immigration hooplas, I finally resumed work on the Arducopter we ordered before I left for home last year. The idea to move to a separate platform was two-fold – first, to allow us to use fancier IMUs to dead reckon better, and second, to allow us to accurately timestamp captured images (using hardware triggered captures) and IMU data so as to help us accurately determine baselines while using SfM algorithms. There is also the added benefit of being able to process data on board using the dual cores on the Pandaboard, cutting the latency issues that also crept up while commanding the ARDrone over WiFi.

I’ve been fiddling around a bit with the ArduCopter source code, and realized that the inertial navigation has been designed to work in between periods of spotty GPS coverage, and not as a standalone solution, which is a perfectly sensible idea for the typical use cases of the ArduCopter. However, since we want to not necessarily depend on the GPS, I realized that all that was needed was a little faking of the GPS in the custom UserCode.pde sketch in the ArduCopter source code.

The good part about the ArduCopter implementing the MAVLink protocol is that I can receive all this information directly over serial/telemetry using pymavlink to create a ROS node that reuses all my previous code built for the ARDrone. I knew all that modular programming would come in handy some time ;) One of our major worries was implementing a reliable failsafe mechanism (and by failsafe, I mean dropping dead), and yet again the beauty of the code being completely accessible came to the rescue. The catch is that when the client overrides raw RC channel values via MAVLink, there’s no way for the RC to regain control of the drone if the connection gets broken. To fix this, I first ensured that I didn’t override channels 5, 6 and 7, and then in the 50 Hz user hook I listened to the Ch 7 PWM values to detect the flipping of the switch and consequently disarmed the motors. I also set the Ch 6 slider to switch between Stabilise and Land so that I could perform a controlled landing whenever I wanted to.
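The channel logic is easy to get wrong, so here is a minimal Python sketch of the idea. The helper names and the 1800 µs threshold are my own illustrative choices, not from the ArduCopter source: override only channels 1-4 over MAVLink, leave 5-8 released, and treat a high Ch 7 PWM as the kill switch.

```python
# Hypothetical sketch of the failsafe channel layout described above.
# RC_CHANNELS_OVERRIDE semantics: a zero value releases that channel back
# to the RC transmitter, so the Ch 7 kill switch still reaches the APM.

KILL_THRESHOLD = 1800  # illustrative PWM value above which Ch 7 counts as "flipped"

def build_override(roll, pitch, throttle, yaw):
    """8-channel RC override tuple; zeros on channels 5-8 leave them
    under RC transmitter control."""
    return (roll, pitch, throttle, yaw, 0, 0, 0, 0)

def kill_switch_flipped(ch7_pwm):
    """Mimics the 50 Hz user-hook check: disarm when Ch 7 goes high."""
    return ch7_pwm > KILL_THRESHOLD

# With pymavlink, the tuple would be sent roughly like:
#   master.mav.rc_channels_override_send(
#       master.target_system, master.target_component,
#       *build_override(1500, 1500, 1300, 1500))
```
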

So, everything’s in place after a little help from the community, and hopefully I shall be following some trajectories sometime soon. With a tether, of course.

Meanwhile, here’s a little list of extra things I needed to do to get all the ROS packages working well on the PandaBoard after the basic install process. Specifically pcl_ros.

ROS packages on Pandaboard

• Install pcl_unstable from svn source
• For the #error in finding endianness, manually specify the endianness to be PCL_BIG_ENDIAN in the header file that throws the error (Crude hack, I know)
• pcl_ros doesn’t compile out of the box because the vtkfind cmake file is messed up. Patch with
• ROS_NOBUILDed ardrone2 since we don’t necessarily require the driver to compile. I hate that custom ffmpeg build stuff.
• ran rosrun tf for ardrone_pid
• Compiled OpenCV from source, hacked around CMakeLists.txt (explicitly added paths)
• Explicitly set ROS_MASTER_URIs and ROS_HOSTNAMEs on both machines when testing out nodes

Get a grip on!

Following Rosen’s suggestion, I used the TaskManip interface to perform grasp planning, and here’s the first look at the ER4U grasping an object. Yes, you read that right, I’m getting agonizingly close!

The next thing I’m going to implement is introducing a couple of placeholders to define the start and end positions of the target.

Hopefully, I’ll also be able to add an IP-based target tracking plugin to later substitute for the placeholders in the simulation.

Picture n’ Pose

So, as I mentioned in my previous post, I am working on recreating a 3D photorealistic environment by mapping image data onto my pointcloud in realtime. Now, the camera and the laser rangefinder can be considered as two separate frames of reference looking at the same point in the world. After calibration, i.e. after finding the rotational and translational relationship (a matrix) between the two frames, I can express (map) any point in one frame in the other. So, here I first project the 3D point cloud points onto the camera’s frame of reference. Then I select only the indices which fall within the bounds of the image, because the laser rangefinder has a much wider field of view than the camera: the region that the camera captures is a subset of the region acquired by the rangefinder, so the projected indices for a 640×480 image might run from negative coordinates like (-200,-300) up to (800,600). Hence, I only need to colour the 3D points whose projections lie between (0,0) and (640,480), using the RGB values at the corresponding image pixels. The resultant XYZRGB pointcloud is then published, which is what you see here. Obviously, since the spatial resolution of the laser rangefinder is much lower than the camera’s, the resulting output is not as dense, and requires interpolation, which is what I am working on right now.
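The projection-and-masking step can be sketched in a few lines of numpy. Here K stands for the pinhole intrinsics and (R, t) for the calibrated rotation/translation; the function and variable names are illustrative, not from the actual node.

```python
# Sketch: colour laser points with image pixels, keeping only points whose
# projection lands inside the image bounds (assumed pinhole model).
import numpy as np

def colour_cloud(points, image, K, R, t):
    """points: (N,3) laser points; image: (H,W,3) RGB.
    Returns (M,6) XYZRGB rows for points visible in the image."""
    h, w = image.shape[:2]
    cam = points @ R.T + t            # transform into the camera frame
    uvw = cam @ K.T                   # project with the pinhole model
    u = uvw[:, 0] / uvw[:, 2]
    v = uvw[:, 1] / uvw[:, 2]
    # keep only projections inside (0,0)-(w,h), and in front of the camera
    ok = (cam[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    rgb = image[v[ok].astype(int), u[ok].astype(int)]
    return np.hstack([points[ok], rgb])
```
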

Head on to the images for a more detailed description. They’re fun!

Screenshot of the Calibration Process

So here, first I create a range image from the point cloud, and display it in a HighGUI window for the user to select corner points. As I already know how I have projected the 3D laser data onto the 2D image, I can remap from the clicked coordinates to the actual 3D point in the laser rangefinder data. The corresponding points are selected on the camera image similarly, and the code then moves on to the next image with detected chessboard corners, till a specified number of successful grabs is accomplished.
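One way to implement the remapping is to remember, while building the range image, which laser point produced each pixel; a clicked pixel then maps straight back to a 3D point. The simple spherical projection below is a stand-in for the actual projection I use, and all names are illustrative.

```python
# Sketch: build a pixel -> point-index map during range-image projection,
# then look up clicked coordinates in it. Assumed spherical projection.
import numpy as np

def build_index_map(points, h, w, fov_h=np.pi / 2, fov_v=np.pi / 3):
    """Project each 3D point to a (row, col) pixel and remember its index.
    Later points overwrite earlier ones at the same pixel (fine for a sketch)."""
    index_map = -np.ones((h, w), dtype=int)        # -1 means "no point here"
    az = np.arctan2(points[:, 0], points[:, 2])    # azimuth
    el = np.arctan2(points[:, 1], np.hypot(points[:, 0], points[:, 2]))
    col = ((az / fov_h + 0.5) * (w - 1)).round().astype(int)
    row = ((el / fov_v + 0.5) * (h - 1)).round().astype(int)
    ok = (col >= 0) & (col < w) & (row >= 0) & (row < h)
    index_map[row[ok], col[ok]] = np.nonzero(ok)[0]
    return index_map

def clicked_point(points, index_map, row, col):
    """Remap a clicked pixel to its originating 3D laser point (or None)."""
    idx = index_map[row, col]
    return None if idx < 0 else points[idx]
```
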

Screenshot of the result of extrinsic calibration

Here's a screenshot of Rviz showing the cloud in action. Notice the slight bleed along the left edge of the checkerboard. That's because of issues in the selection of corner points in the range image. Hopefully, in the real-world calibration we'll be able to use a glossy-print checkerboard so that the rays bounce off the black squares, giving us nice holes in the range image to select. Another interesting thing to note is the 'projection' of the board on the ground. That's because the checkerboard is actually occluding the region on the ground behind it, and so the code faithfully copies whatever value it finds on the image corresponding to the 3D coordinate.

View after VGL

So, after zooming out, everything I mentioned becomes clearer. What this does is that it effectively increases the perceived resolution. The projection bit on the ground is also very apparent here.

Range find(h)er!

So the past few weeks I’ve been working on this really interesting project for creating a 3D mobile car environment in real time for teleop (and later autonomous movement). It starts with calibrating a 3D laser rangefinder’s data and a monocular camera’s feed which allows me, after the calibration is done, to map each coordinate in my image to a 3D point in the world coordinates (within the range of my rangefinder, of course). So, any obstacles coming up in the real world environs can then be rendered accurately in my immersive virtual environment. Now since everything’s in 3D the operator’s point of view can be moved around to any required orientation. So, for instance, while parking all that is needed to be done is to shift to an overhead view. In addition, since tele op is relatively laggy, the presence of this rendered environment gives the op a fair idea of the immediate environment, and the ability to continue along projected trajectories rather than stopping for resumption of connection.

So, as the SICK nodder was taking some time to arrive, I decided to play around with Gazebo, simulate the camera, 3D laser rangefinder and the checkerboard pattern for calibration within it, and thus attempt to calibrate the camera-rangefinder pair from within the simulation. In doing so, I was finally forced to shift to Ubuntu (shakes fist at lazy programmers) which, although smoother, isn’t entirely as bug-free as it is made out to be. So, I’ve created models for the camera and rangefinder and implemented dynamic Gazebo plugins within them to simulate them. I’ve also spawned a checkerboard-textured box, which I move around the environment using a keyboard-manipulated node. So this is what it looks like right now

Screenshot of w.i.p.

So here, the environment is in the Gazebo window at the bottom, where I've (crudely) annotated the checkerboard, camera and laser rangefinder. There are other bodies to provide a more interesting scan. The point cloud of the scan can be seen in the rviz window in the center, and the camera image at the right. Note the displacement between the two. The KeyboardOp node is running in the background at the left, listening for keystrokes, and a sample HighGUI window displaying the detected chessboard corners at the top left.

Looks optimistic!

Logs of a brilliant summer

June 21
Unboxed the new laptop! Yay!

Tried installing Fedora 15 x86_64 from a live USB stick - no dice. The thing didn't even boot, giving weird intel MUX INFO errors and falling to a debug shell (dracut). Tried installing Ubuntu 11.04, and that just crashed. Took it to Pras to try and identify the issue, and eventually realised that the live USB image was at fault, and the OS got installed fine using a burnt disc.

The video drivers look buggy, but it can be worked on.

Wifi seems to be an issue - had to install the package matching the specific uname using yum.
Otherwise I would have had to build the compat-wireless package, which this install avoided.

Had to manually configure the wifi using the terminal the first few times using 
iwconfig wlan0 essid "CMU"
and then playing around with the wifi applet

Had a few issues with installing Adobe Flash Player, and eventually used the 64-bit 'Square' player and copied it to the mozilla/plugins directory, and it works now

Tried installing ROS using the instructions. It didn't (as usual) compile nicely. Had to install boost libraries, but it still showed an error on make - specifically an example program didn't compile, and threw an error stating an undefined
"boost::filesystem3::path::extension() const"

A bit of googling made me realize that this was an issue with the changeover of boost::filesystem from v2 to v3, and so I had to add a #define BOOST_FILESYSTEM_VERSION 2 to all files that #include <boost/filesystem.hpp>. To find these files, I used
find . | xargs grep 'boost/filesystem.hpp' -sl
this gave me about 9 files which I modified, and ran the script again. Success!
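The same edit could be scripted; here is a sketch (my own, not the commands I actually ran) that walks a tree and prepends the #define to every file including the header:

```python
# Sketch: pin BOOST_FILESYSTEM_VERSION to 2 in every source file that
# includes boost/filesystem.hpp, as described in the log entry above.
import os

DEFINE = "#define BOOST_FILESYSTEM_VERSION 2\n"
NEEDLE = "#include <boost/filesystem.hpp>"

def pin_boost_v2(root):
    """Prepend DEFINE to each matching file under root; return patched paths."""
    patched = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path) as f:
                    text = f.read()
            except (UnicodeDecodeError, OSError):
                continue  # skip binaries and unreadable files
            if NEEDLE in text and not text.startswith(DEFINE):
                with open(path, "w") as f:
                    f.write(DEFINE + text)
                patched.append(path)
    return patched
```
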
June 22
Tried installing Eclipse; it crashed with a dump whenever I tried to select preferences.

Fixed it by using the repository version of eclipse - worked like a peach - just had to add the helios packages for dependency resolution from within the preferences

Installed NetBeans 6.9 from the repo

June 23
Worked on ROS tutorials.

However rxplot did not run, giving a
	ImportError: No module named backend_wxagg
error. Installed matplotlib and all the wx packages and ran rosdep install rx, but to no avail. Seemed like a dependency issue. Used the following command to get the missing dependencies:
yum install python-matplotlib-wx

rxplot has weird synchronisation issues: the plot jumps at 1 Hz and eventually messes up the entire x and y plot. The echoed messages, when plotted, do not show this behaviour. BUG, most definitely
June 24

Went through ROS subscriber and message and server client examples

Attended the Airboat Meeting 2
Airboat Meeting 2
Discussed image parameters, and the lens test performed by Louis. Decided that the fish eye actually gave a much wider field of vision vis-à-vis the regular stock camera without sacrificing too much resolution. The problem of reflection still remains, and adding a polarizing filter might add further distortions. The camera can most probably not be set in a prone position, and so it will be left standing (as of now)

W.r.t. the mechanical part, the new foam pour has (perhaps) finally been decided, and robot city needs to be prepped up by next week - up to Captain Abhinav :) Balajee most probably won't be available till the end of July.

Discussed manoeuvring issues with Pras, but he seemed confident of the boat sailing through, especially as the closed loop worked pretty much perfectly on Airboat 1.
Discovered the University Centre Basement - It has an awesome 'Scotland Yard Room' which has a ping pong table, pool tables, soccer tables, a pinball arcade and a dance machine. Had fun playing on it with Remus, Mihai and Piotr

June 27

Installed rosjava all over again - this time had more directories to work with.

in ~/ros/rosjava/android
went into tutorials and library, and modified the project config to point to android-10 instead of (weirdly) android-9

now opened one tutorial project in Eclipse - the standard Android app way. Library paths were messed up, so deleted them, and included them from rosjava/lib

Library (and project) issues - used File->Import existing project to include ~/ros/rosjava/android/library and ~/ros/rosjava/rosjava. Issues decreased significantly

Need to set nodeConfiguration.setHost.....
and set rosCore.createPublic

So, by statically assigning the nodes to run on the device's local IP address, and the roscore master to be running on another machine in the network, we can get ros working :D

The roscore running on the phone is not a full implementation as yet - can't interact with it yet - can only see published things


Also attended a MAV meeting where Sebastian presented a draft of his presentation on river mapping and navigation on MAVs. He uses a pretty decent approach to find out which regions are river, and uses LIDAR to scan the banks. He tracks the path plotted by taking into account the GPS, IMU and optical correlation through a global filter. The optical correlation shows a significant drift, which is corrected by the other two. The processing is done on board with 4 ARM dual-core processors in parallel with a DSP.

June 28
Started working on establishing message framework for rosjava implementation. Filled the board with the decision alternatives.

Found out that the data transfer between nodes is NOT done by XML-RPC - XML-RPC is used just to establish the connection between the two - the data is serialized and sent over using built-in methods over TCP or UDP

Started Eclipse, only to get weird Java compiler issues - it required .class compatibility set to 5.0. Seems that something, somewhere messed with the default settings, so I had to manually set the Java compiler version in Project properties to 1.6. The code compiled then. (Probably because of the running JDownloader?)

Observation: Changing screen orientation resets the code. Probably because it onPause()s the app on changing orientation.

Discussed the implementation of the Airboat server with Pras. So the way forward is to create a bare-bones implementation of an interface (preferably) or an abstract class library with rosjava precompiled within it, so that this library could be implemented in any Java project. Specifically, the implementation of this library for a real airboat will define how, say, setWaypoint() works by selecting a controller and configuring it using PIDs. However, for a simulation, these methods could be defined so as to only move in a Cartesian space on the GUI, or however the simulator wants to move the boats.
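The actual library is Java, but the interface-versus-implementation split can be sketched in a language-agnostic way; everything below beyond setWaypoint is an illustrative guess, not the real API.

```python
# Sketch of the "bare-bones interface, swappable implementation" idea:
# a real boat would pick a controller + PID config in set_waypoint, while
# a simulator just moves a point in Cartesian space.
from abc import ABC, abstractmethod

class VehicleServer(ABC):
    """Bare-bones boat interface; real boats and simulators implement it."""

    @abstractmethod
    def set_waypoint(self, x, y):
        """Drive (or teleport) the vehicle towards (x, y)."""

class SimulatedBoat(VehicleServer):
    def __init__(self):
        self.pos = (0.0, 0.0)

    def set_waypoint(self, x, y):
        # a simulator can simply jump in Cartesian space on the GUI
        self.pos = (float(x), float(y))
```
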

Decided which topics were to be broadcast and which were to be implemented as services. Not everything needs to be published. The TCP or UDP implementation is done by ROS on its own.

Higher order functions like setAOI(Area Of Interest), setNewThresh(To set the 'newness' threshold for the image) and airboatStates (Which stores the state of the other boats) are not crucial to the core functionality and can be implemented with the boat control implementation of this bare bones library.

June 29

Tried running the different tutorials on the android app. Figured out that with roscore working on my laptop the service on the phone couldn't communicate due to the firewall. Watching for topics on the syrah server worked, but no image was forthcoming when I used the image_transport tutorial. Will figure that out soon.

While trying to show this, realised that I needed to rosmake image_view. This (almost immediately) threw errors (how predictable)
Installed the jasper, ffmpeg and GraphicsMagick -devel packages and managed to rosmake libplugins.
Also had to install libuuid-devel

However the build started failing with errors very similar to those encountered on June 21, with the boost libraries used in the ROS code being deprecated. Performed the same manoeuvre by editing boost_fs_wrapper.cpp

Old friend OpenCV2 then started playing up - it threw a whole host of ptrdiff_t errors. Fixed it by editing build/opencv-svn/modules/core/....core.hpp and including <cstddef>. Voila! Build complete! :)

sudo yum groupinstall "Development Tools" "Legacy Software Development"

June 30
Finally managed to view the Android camera using one of the tutorials in /android/ It was pretty cool. Realised that image_view required a few dependencies that hadn't been compiled - theora and compressed.

Also, the code did not follow the standard ROS convention of the compressed image being published to camera/compressed, publishing it to /anonymous/camera instead. Hence fixed this to publish to /camera/compressed and voila, the image could be viewed from any node.

Now all that is required is to implement this with actionlib

July 1

Attended the airboat meeting where we discussed the results of the imagery tests performed by Pototo using the curved mirrors. Also discussed the mechanical component of the design with Chris and Abhinav. The foam material seems to have been finally chosen, and the acrylic base of the prop needs better bearing support as it often interferes with the makeshift channel that Chris carved out for it.

Managed to implement the fibonacci actionlib example by running the actionlib server on the android app and listening to the broadcasts using the stock SimpleFibonacciSequenceClient app after pointing it to this server.

All that needs to be done now is to link the airboatServer components in it.

July 7

Worked on the 

July 8

Airboat Meeting 3

Discussed specifications for the test to be conducted at the irrigation lake near Baltimore

Primarily, the plan is to perform sensor tests - pH, conductivity and temperature over the 120x60 m pond. The plan of action is to divide the lake into a grid of 5x5 m sample patches, and proceed with further navigation - three methods have to be implemented

1. A Random Walk/ Lawn mower
2. Prefer Highest std. Dev./Err.
3. Prefer high gradients

So the work to be done is
Paul: Autonomy setup
Abhin: Sensor Calibration
KSS: Control + Image connection
Louis: Bank avoidance

Also discussed the requirement for the boat to stop and possibly rotate to point towards a specific object or direction (especially for capturing imagery)

Discussed the depth of the sensors required, and whether a retractable sensor rod could be used.

The camera position still has to be decided. Pras plans to have a quad on the boat as well, which might occlude the view. However, mounting a phone that high up poses issues.

July 11

Finally started working on actionlib message generation - realised that I needed to generate a .action file. So used a pose for Goal, and two Vector3s, etc. as per the requirements. Didn't implement the Camera straightaway - that's deferred for later

Generating the messages was harder than expected. Trying to use the .action file in my own project just didn't generate anything! Anyway, copied it to actionlib_tutorials, only to be greeted with compiler errors stating that the package org.ros.message.geometry_msgs does not exist :S

A better look into things leads me to the conclusion that the python script may not be at fault in this respect, and that it does not need to be modified. However, I had to modify a few source files.
The issue is that during compilation of the roscpp.jar file, errors like this are generated:

    [javac] home/kshaurya/.ros/rosjava/gen/srv/roscpp/org/ros/service/roscpp/ package org.ros.internal.message does not exist
    [javac] import org.ros.internal.message.Service;

Eventually I realized that the test_ros messages were not being generated, and so I had to manually generate the message and service files to ~/.ros/rosjava using the python scripts, and change their path to point to ~/.ros/rosjava/gen/org.....

AND modify the three roscpp files (Empty, GetLoggers, SetLoggerLevel) to import org.ros.message.Service rather than org.ros.service.Service.

Fixed these to get things working again. In addition Pras has added a crw-cmu thing too, so I guess things are set now.

With regards to the architecture, multiple actionlib servers shall be set up - one for say pure waypoint navigation, another for camera, etc.

July 12

Managed to mess up wifi earlier in the day by uninstalling NetworkManager. Had to reinstall it using a rpm downloaded off a separate computer.

Finally started working on the actionlib implementation. Created a simple .action file for waypoint navigation, and included it in actionlib_tutorials. Regenerated the jar files, and included it in the crw-cmu  project. Also added geometry_msgs, etc.

Mostly copied the fibonacci example, omitting unnecessary stuff.

Got access to the repository (yay!) and pushed my commits. Pras later fixed some of the stuff and the autogeneration code.

July 13

Played around with the callback functions. The sac.waitForResult method doesn't seem to work that well (perhaps there's some issue in its implementation)

Talked to Pras, and realised that ActionLib was to be used for setting the *configurations* and not for accessing/polling data.

So the way it works now is that the VehicleServer interacts with the outside world via actionlib for config messages such as set camera on at so and so params, stop taking pictures, go to so and so waypoint, stuff like that. The publishers take care of sending out the data streams as such.

July 14

Implemented a dummy waypoint controller using actionlib. I shifted the separate RunActionServer and RunActionClient into a single SimpleBoatSimulator class as launchServer and launchClient methods. Committed the code.

Pras said that we could possibly get this stuff now working on the boats in a couple or so days.  

July 17

Sat down with Pras to set up the boilerplate code to encapsulate all the wrapper classes into a single class.

July 18

Fixed the bootup issue on the laptop by disabling the sendmail service. Bootup times dropped from ~3 mins to 13 seconds! XD

Also fixed the RosServer calls for multiple actionlib instances. The issue was that each actionlib instance was using the same /status, /goal, /result, etc. topics, as they were all assigned to the default (root) namespace. To fix this, I created a separate node configuration for each of the buildSimpleActionServer instances and assigned each a different node namespace using the NameResolver class
Airboat Meeting 4

Saw the latest working prototype. Decided on the date of the visit to the irrigation pond. George was present too, and he mentioned some new low cost small form factor integrated sensor that a company was intending to launch soon.
Also decided that shelling out about $70 more would get us new Google Nexus S's, which have a 3-axis gyro integrated - this saves the trouble of purchasing an external IMU at the same cost, with the added integration issues.
George said that the sensors that the scientist at the irrigation pond uses are attached to buoys shaped like cowboy hats. The water recirculates through all the fields, and hence they really need to measure the water quality at all times to ensure that no pollutants (or pathogens) enter the water stream. 
There was also talk about some BOD thing, don't really remember what it was exactly.

'Special' Talk/Seminar
Dental, Aircraft assembly

Devices: Omega, Phantom Desktop, Haption 6D, Delta
Can be serial or parallel

Backdrivability, singularity problem - require a task space without a singularity
(Gosselin, 90)

A ẋ + B θ̇ = 0
Internal singularity ⇒ det A = 0

To evaluate how far away from singularity, Gosselin, Pottmann, Kozak - a physical representation, rather than a mechanical one

Voglewede, et al

So the deal is to narrow down the singular subspaces to those created by the Force applied and those by the moments. Once the two subspaces are obtained, the complement of their union is the 'safe' workspace to implement. (the union is called the Pseudo singular space)

To determine these parameters Mf and Mm, we use Rayleigh quotients

So, there were two hinged joints on each side of the triangular base - spherical joints + 4-bar linkage to give a 6 DoF end manipulator - movement and twist

Hao Li, Yuri Zhang, IEEE 2011.5

No issue with low frequency human jitters ~7Hz. Devices are working at 1 kHz....:S
Umbrella Based Texture Bombing

Create carpeted textures - mimic diversity with limited seed vals

Related works - texture bombing, Diversification, Level of Detail,

Create a polygon from a characteristic shape to create an umbrella, and then modify the shape to create modified umbrellas by morphing the end points of the polygons generated - and the colour can be changed as well - so morphological diversification

Divide the space into grids, and map the umbrellas such that at least a grid square overlaps (mipmapping for places where high LoD is not required)
So for things further away, you need to represent them with fewer objects if possible
To do this they cluster using K-Means, find the convex hull, and merge the overlapping umbrellas, blending their individual colours
This works as umbrella footprint is really small
To make things more realistic, add shadows using pose information, and lighting using knowledge of global lighting.
Allows smooth rendering in real time.

Brain Computer interface thesis

Samuel Clanton

Klein's motion control (impedance velocity controller)

7 D decoding model

arrays inside cortex
Spike rate = delta V / time

R, P, Yaw are not vectors per se, so angular velocities are used

how to calibrate decoding model? 
	a. Use the model on the monkey's arm, and then use that
	b. Use observation data - the monkey saw humans do it, or it tried random things till it figured out what to do

no better hemisphere for control

Impedance velocity controller
Klein's controller - spring+damper attached to the robot hand

However the monkey doesn't always think about moving the hand
so an active shared control system was used - mixing the operator command and an auto command which gave perfect commands as per the robot configuration
Later used passive shared control - which attenuates portions not positively spanned
so these work as guiding mechanisms

So to get the monkey to learn very well the error admittance is gradually reduced to make it less and less guided

The monkey started using the force feedback to align the robotic hand quite often, instead of actually trying to rotate it to orient itself to the target - the feedback is visual.

Airboat Meeting 6

Tried to analyze the spinning in the software crash

Taking three boats, implementing two. Couple of extra hulls. Electronics for three boats

The EC measurement is instantaneous, but temp and oxygen might have transience

Getting the boat to drive around and GET measurements is more important as an issue

Need to perform range tests

An autonomy toggle will be required to cross the line between the buoys and resume, so the boat just realises that it has 'jumped' to the new pos - no need to restart

Pras talked about calibration. Need to figure out the horizon line deal, even if its constant.

1. ROS Tutorials
2. Make rosjava
3. rosjava tutorials
4. Need to figure out messages in ROS for interfacing
4.1 Try out ActionLib
5. Pure Java library to blackbox all the ROS stuff - a wrapper of sorts
5.1 Fix repository layout, and use crwlib jar 
5.2 Work on Image transport code
5.3 Work on figuring out RosServices

July 25
Had issues with the wifi again. Hopefully fixed it by lowering the frequency using iwconfig to 2.462 GHz (I think that's g speed)

Pras fleshed out the interfaces. A Test package has been created to test out stuff. A few messages were changed, and so had to delete archaic entries.

Tried to visualise stuff using rviz. Realised that would

July 27
July 28

Had to make flyers for the RISS thing

Talked to Paul, and started working on implementing VehicleServer for his code

July 30
432254.86, 4371539.65, 0.00 18 north

432254.86, 4371539.65, 0.00 18 north

base  station
432227.44 4371540.67

432206.243    4371541

432246.1 4371543

Right bank, green bush
432248 4371589

432248 4371544

August 1

Testing in Schenley

It is heading correctly, but there is significant GPS drift
The goal is sent repeatedly ad infinitum. Couldn't see 'completed' on the GUI, whereas the controller DID stop on reaching the waypoint.

Have to clear the current waypoint on achieving the completed flag.


Phone had to be yawed perpendicular to the box axis

August 8

Talked to Paul about further work this week. He mentioned that I could work on one of two things - the controls or the IP-based stuff. So I decided to do something on the image queueing methodology.

Paul said that in doing that I could possibly economise on the images that I send, for example sending images only when the boat changed position significantly - so for example a yaw, or after moving say 10 metres - he would then implement an interface for that, and we could write a paper or something like that on it.
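The "only send on significant motion" test is simple enough to sketch; the 10 m and 30° thresholds below are placeholders, not agreed values.

```python
# Sketch: decide whether to capture/send an image based on how far the boat
# has moved or turned since the last capture.
import math

def should_capture(pose, last_pose, dist_thresh=10.0,
                   yaw_thresh=math.radians(30)):
    """pose = (x, y, yaw) in metres/radians. True when we've moved or
    turned past the thresholds since the last captured pose."""
    dx = pose[0] - last_pose[0]
    dy = pose[1] - last_pose[1]
    # wrap the yaw difference into (-pi, pi] before comparing
    dyaw = (pose[2] - last_pose[2] + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(dx, dy) >= dist_thresh or abs(dyaw) >= yaw_thresh
```
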

Airboat Meeting x

Go with three copies of the best design

similar shrouds, with minor tweaks

Temperature sensor at the hardware level
add it as a sensor channel

Going to try a heatsink attached to the ESC before doing anything else

Performance eval criteria ---

Contact info on the boat

Behaviour for comm losses - WHEN to turn around...
Hence the easiest way is a go-home app, which overrides the existing setup

Restart remotely

Talk to brad for obstacle avoidance

Queue up packets in spotty comm

Add connectivity parameters in sending in data (good connectivity => send more)

Observations from discussions with Pras on the Tests

1. Weird Network bug, when topics did not show up on my roscore, but did on connecting Abhinav's - very probable that it's a network bug

2. Possible boat interactions - How to broadcast current state to other boats

3. How to implement a novelty basis for taking pictures that are situationally aware - the first elemental step would be to take pictures when the boat changes its orientation or position. More interesting ways to go about it would possibly use
a. A geometrical based pie sector based mapping
b. A grid based mapping
where the controller tries to take a path to goal that will actually acquire the maximum number of images along the path.
Such a navigation methodology differs from a path-finding methodology in that the future path relies on the path previously taken to reach a particular point - you really don't want to keep moving in a straight/shortest path in which you, say, keep looking at the goal all the time.
4. A debug button to start a new log every time the operator wants one during deployment 

5. Integrating Pototo's code - consider the image and the obstacles as a pseudo laser scan, with distances from the base to the FIRST obstacle converted to metric distances and published to a controller. We could publish to cmd_velocity, with an autonomy controller sitting around pretty, and an external publisher doing the dirty stuff for the mean time.


Testing the bed

In order to work on the recognition algorithms I realised that I didn’t have enough (read I had none) real world image data to train/classify the character recognition code. So I built up a test bed that grabs a character from the database that I downloaded (I think it was the NIST alphabet database), superimposes it on a coloured shape (which was restricted to squares and circles, the latter being worked out using a negative (mask) image), rotates this target at an arbitrary angle and scales it down to a certain ratio and places this ‘target’ randomly across a generic aerial field picture. The test bed works pretty well, and I utilized the same (coarse) code to gather the letter image and/or the contours to be fed to the classifier.

Having done the test bed, I moved on to the recognition tasks, which I naively assumed would only take a little time to get their act together. I had read up a lot of papers, and the most straightforward and promising approach appeared to be Hu moments. Hu moments are essentially seven numbers algebraically derived from the simple geometric moments of an image, with the special property that they describe the shape invariantly to scale and rotation. To make things even better, OpenCV has built-in methods to determine the Hu moments of an image, and to compare them using four distance measures (statistical equations which determine the 'distance' between the compared values).

So, I set up a folder of nicely normalized alphabets, compared the Hu moments of the extracted letter with the Hu moments of all these reference letters, and assumed that the letter with the lowest difference should be the matching letter. Pretty straightforward, right? Unfortunately for me, that is definitely not the case: the letters hardly gave correct results, and for slimmer letters the results went haywire, with L and Y taking away almost all the matches. Only R behaved well. Here's a snapshot of the test bed in action:

A screenshot of the letter R in the bed

Turns out that Hu moments aren’t complete descriptors, for example, they can’t distinguish between a normal pan and a pan with two diametrically opposite handles.
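For the curious, the pipeline can be sketched in plain numpy - the first four of the seven Hu invariants from normalised central moments, and a log-distance match loosely in the spirit of cv2.matchShapes. This is an illustrative reimplementation, not the OpenCV code I actually used:

```python
import numpy as np

def hu_moments(img):
    """First four of the seven Hu invariants, from scale-normalised
    central moments (a plain-numpy sketch of what cv2.HuMoments does)."""
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    cx, cy = (xs * img).sum() / m00, (ys * img).sum() / m00
    def mu(p, q):   # central moment
        return (((xs - cx) ** p) * ((ys - cy) ** q) * img).sum()
    def eta(p, q):  # scale-normalised central moment
        return mu(p, q) / m00 ** (1 + (p + q) / 2)
    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    e30, e03, e21, e12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return np.array([
        e20 + e02,
        (e20 - e02) ** 2 + 4 * e11 ** 2,
        (e30 - 3 * e12) ** 2 + (3 * e21 - e03) ** 2,
        (e30 + e12) ** 2 + (e21 + e03) ** 2,
    ])

def match(img, references):
    """Nearest reference letter by summed log-moment distance."""
    h = np.log(np.abs(hu_moments(img)) + 1e-30)
    dists = {k: np.abs(h - np.log(np.abs(v) + 1e-30)).sum()
             for k, v in references.items()}
    return min(dists, key=dists.get)

# quick sanity check: a blocky 'L' keeps its moments under rotation
L_img = np.zeros((12, 12))
L_img[2:10, 2:4] = 1
L_img[8:10, 2:8] = 1
```

For a 90 degree rotation the invariance is exact (it's just a pixel relabeling), which makes it easy to verify the maths before trusting it on real targets.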

Disappointed, I moved on to the very exciting field of Neural Networks, and the promises that they held in store. After studiously banging my head against the OpenCV documentation (somebody *really* needs to work on it. It is haranguing! –update: I sincerely hope the intern at GSoC does a good job!) I finally managed to get something working – the code didn't throw up errors. I saved the file and got thousands of lines of coefficients – which seemed like what a network should look like.

How I approached the Neural Networks was thus: I selected one image for each letter from the database and then generated 36 images by rotating the letter progressively by 10 degrees through 360 degrees (thus I had 26×36 images in all). All these images were already normalized and centered, and I scaled them down to 32×32 px. Hence, my design for the network was this – 32×32 (= 1024) input nodes, a hidden layer of 100 nodes, and an output layer of 26 nodes, with the value of each output node representing the probability of the input being the letter corresponding to that node. I used OpenCV's implementation of the multi layered feed forward perceptron (ain't that a fancy ((and intimidating)) name?) and trained the network using back propagation, again implemented in the CvANN_MLP class.
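The textbook version of that 1024–100–26 network, with plain backpropagation steps on squared error, looks roughly like this numpy sketch. The learning rate, initialisation and the single made-up training sample are all my choices here; CvANN_MLP does considerably more under the hood:

```python
import numpy as np

rng = np.random.default_rng(0)

# network shape from the post: 32x32 = 1024 inputs, 100 hidden, 26 outputs
W1 = rng.normal(0, 0.1, (1024, 100))
W2 = rng.normal(0, 0.1, (100, 26))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    h = sigmoid(x @ W1)          # hidden activations
    return h, sigmoid(h @ W2)    # output probabilities (roughly)

def backprop_step(x, target, lr=0.5):
    """One gradient-descent step on squared error - the textbook form of
    what back propagation training does internally."""
    global W1, W2
    h, y = forward(x)
    dy = (y - target) * y * (1 - y)   # output-layer delta
    dh = (dy @ W2.T) * h * (1 - h)    # hidden-layer delta
    W2 -= lr * np.outer(h, dy)
    W1 -= lr * np.outer(x, dh)
    return ((y - target) ** 2).sum()

x = rng.random(1024)              # a flattened 32x32 'letter'
t = np.zeros(26); t[0] = 1.0      # one-hot label, say 'A'
losses = [backprop_step(x, t) for _ in range(50)]
```

Watching the loss actually fall on a toy sample is a useful smoke test before throwing all 26×36 images at the black box.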

I had no idea if my implementation was working correctly or not, but the 'results' sure were way off. I contemplated using Hu moments as inputs, but then again, the descriptors themselves were not trustworthy in the first place. Seeing no way out of the impasse, and with growing desperation to get recognition working to meet the already missed deadlines, I started looking into other methods. Zernike moments became the new 'it' thing, but due to paucity of time I had to abandon pursuing them and tried a couple of other techniques in the meanwhile. (In case you're wondering, K-NN was not even considered this time around.)

I tried a clever approach by Torres, Suarez, Sucar, et al. which involved drawing concentric circles from the centre of mass of each image and counting the number of white-to-black transitions, a count which does not vary with scale or rotation. However, the small size of the letters and the resulting inevitable artifacts made this method very unreliable, and led to its consequent shelving. Here's a screenshot of that in operation (I made it pretty nice and colourful. Notice the circular code recognizing it as a T. Nods disapprovingly.)

It really circles
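The core of the circle method is simple enough to sketch in numpy - sample a circle around the centre of mass and count white-to-black transitions. The radius and sampling density here are arbitrary choices for illustration, not the paper's:

```python
import numpy as np

def ring_transitions(img, radius, samples=360):
    """Count white-to-black transitions along a circle around the centre
    of mass - a signature that is stable under rotation (and, with radii
    scaled to the blob size, under scale too)."""
    ys, xs = np.nonzero(img)
    cy, cx = ys.mean(), xs.mean()
    theta = np.linspace(0, 2 * np.pi, samples, endpoint=False)
    ry = np.clip(np.round(cy + radius * np.sin(theta)).astype(int),
                 0, img.shape[0] - 1)
    rx = np.clip(np.round(cx + radius * np.cos(theta)).astype(int),
                 0, img.shape[1] - 1)
    vals = img[ry, rx]
    # a transition = white sample immediately followed by a black one
    return int(np.sum(vals & ~np.roll(vals, -1)))

img = np.zeros((21, 21), dtype=bool)
img[:, 9:12] = True   # a fat vertical bar, roughly an 'I'
```

A vertical bar cut by a circle gives exactly two white arcs, hence two transitions - and it's easy to see how a few stray artifact pixels on a tiny letter would wreck the count.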

Another promising method was from an ICDAR paper which used rotated images of the characters (just like I did for the neural network training) to build up an Eigen space and then come up with an Eigen face of sorts, which, when compared with a given image vector, would recognize BOTH the letter and the orientation. However, the mathematics seemed a bit dense, and the recent decision to stop working on recognition led to its abandonment.
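The idea, as far as I followed it, can be sketched with a plain SVD - stack the rotated training images as vectors, keep the top principal directions, and match a query by nearest projection. The toy data and all names here are mine; the paper's actual formulation is denser:

```python
import numpy as np

def build_eigenspace(images, k=10):
    """Stack training images as row vectors and keep the top-k principal
    directions (an eigenface-style basis via a plain SVD)."""
    X = np.stack([im.ravel().astype(float) for im in images])
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(img, mean, basis):
    """Coordinates of an image in the eigenspace."""
    return basis @ (img.ravel() - mean)

# toy stand-in for the 26x36 rotated-letter stack
rng = np.random.default_rng(1)
images = [rng.random((8, 8)) for _ in range(3)]
mean, basis = build_eigenspace(images, k=2)
coords = [project(im, mean, basis) for im in images]
```

Because each training vector is a (letter, orientation) pair, the nearest projected neighbour recovers both at once - which is the appeal of the method.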

So, like I just mentioned, we discussed and decided that the focus of the team should be on presenting the required 'actionable intelligence' to the competition judges, and so we need to complete our GUI, segmentation and acquisition tasks to get a minimum working model ready.

It’s been quite an arduous task, but in all fairness, recognition itself is not required to be autonomous. Sigh.

Bugling the Beagle

The BeagleBoard’s here! Finally! So, after a day of messing around with the board, and trying to figure out how it worked, I am now, typing this very first blog post from within the Angstrom distribution loaded on the board! Woot!

In other related news, we had our first flight test this Saturday. The video feed was pretty good at 640×480, and the targets held up pretty well against the wind, contrary to ahem people's expectations. Got a decent image of the star target, but the image quality in general was very poor. On switching to 1280 the video feed started lagging, and there was an issue with the antennas as well, so we couldn't test that thoroughly. However, it's pretty evident that we need a better camera to get those stills. The Axis camera is just not making the cut – we couldn't get it to focus at ~200 ft, there is a LOT of spherical aberration, and the resolution wasn't acceptable either. So, most probably we will be employing a still image camera in the next flight test, when we'll couple the BeagleBoard to the camera.

One of the better captures on flyby

So, back to the Beagle Board.

Now, the very first thing was setting up minicom. That was pretty straightforward, and following the instructions on the wiki, I managed to get the serial comms working. The next part was checking the functioning of the board. So I hooked up the null modem cable to the board, connected it to a mini USB cable, and saw an entire boot up process that eventually led me straight to a Linux (Angstrom) terminal over minicom. Encouraged by the result, I tried running it again with the display connected, only to be greeted by a Kernel Panic and subsequently hung 'Uncompressing Linux…' dialogs.

So, I procured the MLO, u-boot and uImage files along with the Angstrom tarball from the Angstrom website, formatted the SD card into the boot and ext3 partitions, and copied over the requisite stuff. Put everything together again and voila!

Points to be noted, then

  1. The default screen resolution is a VERY garish 640×480. It's pretty exciting to look at initially, but is not workable. To get around this, after much searching, I figured out that it is at the preboot stage (when U-Boot asks for a keypress to stop autoboot) that we assign dvimode to the resolution of our requirement. So, it means a simple boot.scr (edited in vi) containing
    setenv dvimode 1024x768MR-16@60
    run loaduimage
    run mmcboot

    and you’re done!
  2. The SD card reader jackets (the micro-SD to SD card converters) are VERY unreliable. DO NOT trust them. Ever. Go ahead with the much simpler and reliable netconnect modems. If you get junk characters over serial, check that the COM cable is tightly attached, and that the SD card has the MLO, uImage and u-boot.bin files in the boot partition.
  3. Plug in the HDMI to DVI cable before plugging in the power. Also, get a power supply of 5V and around 2A – an old Axis adapter fit the bill perfectly. Also plug in peripherals before the power; the mini-USB cable is not really required then.
  4. Connecting the board to the network is easy enough. In the network connection applet, set the IPs manually, and set IPv6 to automatic. That gets the internet working.
  5. #beagle is your friend on freenode.

Now, as the beagleboard is up and running, the next task is to get opencv (and consequently the code) working on it. Hm. Also, will probably be looking at building customized boot images of Angstrom. Let’s see over the coming days.

And it’s done!

Yes! Finally! I got the code working on linux!

-Drum roll-

I had to configure Code::Blocks first. There were minor hiccups in configuring it, namely the include libraries. How I managed to get the code running was by

  1. Creating a new console project in Code::Blocks
  2. Going to Project -> Build options, and in Linker Settings, added the whole gamut of library files (in the ‘Other Linker options’). For the sake of completeness, they were -lopencv_core -lopencv_imgproc -lopencv_highgui -lopencv_ml -lopencv_video -lopencv_features2d -lopencv_calib3d -lopencv_objdetect -lopencv_contrib -lopencv_legacy -lopencv_flann
  3. In the Search Settings, added /usr/local/include/opencv/ for the compiler tab, and /usr/local/lib/ for the linker tab
  4. The next step involved copying all my source files and headers in the project directory, and including them in the project. And that’s it!
  5. EDIT: That, apparently, is not it. To locate the shared libraries, the LD_LIBRARY_PATH environment variable needs to be set to the path of the OpenCV lib directory – export LD_LIBRARY_PATH=/usr/local/lib/

So, with that done finally, we can move on with

  1. Porting this code to the BeagleBoard/SBC
  2. Further development work, most notably getting the letter and shape recognition neural networks working. That shouldn’t take too much effort – the new interfaces can be explored.
  3. Updating the code to C++ according to the new framework. Now that would involve considerable (re-)learning.

And, here is an output of the code on the Raven logo! (Yes, loads of work is unfinished. But things are looking good!)

The first output of my code on Linux!