Tuesday, 20 October 2015

Matlab: Match elements of a column in two arrays with different number of rows and move data from one to the other

Problem:

    Suppose the inventory tables Sample1, Sample2, and Sample3 each have the columns 'Name' and 'Count'. The tables share some of the same names, but each has a different number of rows. We want to combine all three into one table, T_sum, in which 'Name' is a unique list and each row holds the count of that name in every sample.

E.g., from several tables like the following

Name Count
Bread 234
Pop 123
... ...

to a summary table like the following
Name_all Sample1-Count Sample2-Count ...
Bread 234 236 ...
Pop 123 456 ...
... ... ... ...

Solution #1:  use for loop

%% find the list of Names
Name_all = unique([Sample1.Name; Sample2.Name ...]);

%% do a for loop to migrate data from individual tables to a summary table
% preallocate the memory for the summary table
Counts = zeros(size(Name_all, 1),1);
T_sum = table(Name_all, Counts, Counts, ... );

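% set table column names: use a cell array of strings (square brackets would
% concatenate them into one string); older Matlab releases also require valid
% identifiers, so '_' is used here instead of '-'
T_sum.Properties.VariableNames(2:end) = {'Sample1_Count', 'Sample2_Count', ...};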

% for loop for copying data
for jj = 1:N_samples
    for ii = 1:size(Name_all, 1)
        % find the index of Name_all(ii) in Sample{jj}.Name
        % (== works for categorical names; use strcmp for a cell array of strings)
        tmpIndex = Sample{jj}.Name == Name_all(ii);
        % if Name_all(ii) appears exactly once in Sample{jj}.Name, copy its Count to T_sum in the column for sample jj
        if sum(tmpIndex) == 1
            % column 1 holds the names, so sample jj goes in column jj+1
            T_sum{ii, jj+1} = Sample{jj}.Count(tmpIndex);
        end
    end
end

This code is good for a small set of data. When it comes to a large set of data, it becomes very slow.

Solution #2:  Vectorization (faster)

The initialization and preallocation are the same as in the code above. We only need to replace the nested for-loops with the following code:

%% use vectorized indexing to replace the for-loops

for jj = 1: N_samples
     % use ismember to find which names in Name_all appear in Sample{jj}.Name (tmpIndex) and where they are located (tmploc)
     [tmpIndex, tmploc] = ismember(Name_all, Sample{jj}.Name);
     % delete zeros in tmploc (names that are not in Sample{jj})
     tmploc(~tmploc) = [];
     % copy the counts to the summary table; column 1 holds the names
     T_sum{tmpIndex, jj+1} = Sample{jj}.Count(tmploc);
end


Now, it is much faster.
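
For readers who want to check the matching logic outside Matlab, here is a minimal Python sketch of the same idea. The sample names and counts are made-up illustration data, and a dictionary lookup plays the role that ismember plays above; names missing from a sample keep a count of 0, like the preallocated zeros.

```python
# Hypothetical inventories (illustration data only): name -> count for each sample.
samples = {
    "Sample1": {"Bread": 234, "Pop": 123},
    "Sample2": {"Bread": 236, "Pop": 456, "Milk": 12},
}

# Unique, sorted list of all names, like unique([...]) in the Matlab code.
name_all = sorted(set().union(*samples.values()))

# One row per name, one count per sample; 0 where the name is absent.
summary = {
    name: [counts.get(name, 0) for counts in samples.values()]
    for name in name_all
}

for name, row in summary.items():
    print(name, row)
```

Each name is looked up directly instead of being compared against every row, which is the same reason the ismember version avoids the inner loop.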

Please leave a comment if you have any questions.




Wednesday, 6 May 2015

R Loading xlsx problem

require(xlsx)
Loading required package: xlsx
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: fun(libname, pkgname)
  error: No CurrentVersion entry in Software/JavaSoft registry! Try re-installing Java and make sure R and Java have matching architectures.
Failed with error:  ‘package ‘rJava’ could not be loaded’


Problem answered by http://stackoverflow.com/questions/17376939/problems-when-trying-to-load-a-package-in-r-due-to-rjava

The reason is probably that you are using a 64-bit OS and a 64-bit version of R, but do not have Java installed with the same architecture. What you have to do is download 64-bit Java from this page: https://www.java.com/en/download/manual.jsp After that, just try to reload the xlsx package. You shouldn't need to restart R.

Friday, 1 May 2015

how to change the alt-tab application switcher in ubuntu unity

Original post is from here : http://ubuntuforums.org/showthread.php?t=2211863



sudo apt-get install compizconfig-settings-manager
sudo apt-get install compiz-plugins

Open compizconfig-settings-manager: press Alt-F2 and type ccsm.

Scroll down to "Ubuntu Unity Plugin". Choose the tab "Switcher". Disable the alt-tab and shift-alt-tab key bindings ("Key to start the switcher" and "Key to switch to the previous window in the Switcher").
Click the "Back" button.

Scroll down to the "Window management" section. Here you can select another switcher.
I enable the "Static Application Switcher" and resolve any key-binding conflicts by choosing the setting for "Static Application Switcher".
Now you can tweak the switcher by clicking on it. I have changed alt-tab and shift-alt-tab to "Next window (All windows)" and "Prev window (All windows)".

Experiment to find the settings that work best for you. If you want to go back to the original settings, simply disable the Static Application Switcher and enable the key bindings in "Ubuntu Unity Plugin" again.

Thursday, 16 April 2015

Find the maximum and its coordinates in a region of a matrix

%% Select the region of interest
Subregion = A(rowStart:rowEnd, colStart:colEnd);

% find max value and get its index
[value, k] = max(Subregion(:));
[i, j] = ind2sub(size(Subregion), k);
% move indexes to correct spot in matrix
i = i + rowStart-1;
j = j + colStart-1;

Original link to the solution is http://stackoverflow.com/questions/7677996/matlab-finding-max-value-in-a-region-of-2d-matrix
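
For comparison, the same steps can be sketched in Python with NumPy. The matrix and region bounds below are made-up illustration values. Note that NumPy indexing is 0-based and slice ends are exclusive, so the offsets are added without the "-1" used in Matlab.

```python
import numpy as np

# Hypothetical matrix and region bounds (illustration values only).
A = np.array([[1, 9, 3, 4],
              [5, 6, 7, 8],
              [2, 0, 11, 1],
              [10, 3, 2, 6]])
row_start, row_end = 1, 3   # rows 1 and 2 (end is exclusive)
col_start, col_end = 1, 4   # columns 1, 2 and 3

# Select the region of interest.
sub = A[row_start:row_end, col_start:col_end]

# Flat index of the maximum, converted to (row, column) inside the region.
k = np.argmax(sub)
i, j = np.unravel_index(k, sub.shape)
value = sub[i, j]

# Move the indexes to the correct spot in the full matrix.
i += row_start
j += col_start
print(value, i, j)
```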

Friday, 23 January 2015

Memories of working with Dr. Stephanie Grothe

I was very sad to hear about the tragedy that happened at Joffre Peak, B.C. It has been a hard time for me, losing such a good working partner: Stephanie Grothe. Many people have told stories about how good she was in life. Here I would like to tell you how great a person she was at work.


Steph and I worked together on a few research projects with the scanning tunneling microscope (STM), starting back in May 2011.

At the beginning, I was a newbie to STM. Steph, who was an expert, taught me all the operations in two weeks. She then said to me, “you are a fast learner”. But I knew deep in my heart that, without Steph’s great patience, accuracy, and expertise, I could not have acquired the knowledge so fast. She was a great teacher.

During the two years we worked together, we had many successes. A crucial factor leading to the successes was her rigorous attitude towards research and work. Our STM is like a spoiled baby. If one makes a tiny mistake, it refuses to cooperate for weeks or, sometimes, even months. However, this baby was always happy under Steph’s “babysitting”, and with her there was not a single operating mistake in two years. Additionally, Steph wrote a manual with all the details for operating the instruments, and it has become a bible for everyone who works on the STM.

Another crucial factor in our successes was Steph’s positive and optimistic personality. In our research projects, a change of a single atom can ruin the measurement. Usually a good result comes after many failed trials. Therefore, one can easily become depressed and feel like giving up in the middle of a project. In our projects, Steph never lost her patience, even after months of failures. I was strongly influenced by her positive and optimistic attitude towards our projects. We supported and encouraged each other, and solved the problems together. Finally, our efforts yielded four publications in peer-reviewed journals.

Beyond her brilliant personality and remarkable expertise, she had a special talent for visualization. In our research projects, we often had to dive into a large set of data to discover new physics. Steph was extremely good at interpreting the data using graphs, in particular using colors. In our publications, most graphs were designed by her: they are beautiful and just right to show the physics. If you want to see her gift for visualization, you can see her publications as examples: http://lair.phas.ubc.ca/person/stephanie-grothe/

I feel extremely lucky to have worked closely with her at UBC. I learnt more from her than she did from me. In particular, her rigorous attitude towards work and life, and her positive and optimistic personality, have profoundly influenced me for the better. She will be missed, and is being missed.

Wednesday, 21 January 2015

Pandas In Ipython: Basic Plotting in Ipython notebook

As a beginner with pandas, I am posting what I learn as I go.
First, I would like to summarize some basic and essential commands for plotting a 2D graph.

When opening an IPython notebook, one can use the following command:

ipython notebook --pylab

This automatically imports pylab.
Then import pandas:

import pandas as pd

Read a data (.dat) file using:

data = pd.read_csv('directory+filename', header = 12)

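Here header is the (0-based) row number that holds the column names, which equals the number of rows of descriptive text before the column names in your .dat file; those rows are skipped.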
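
A minimal, self-contained sketch of how header behaves, using a made-up file held in memory instead of a real .dat file (the contents and column names are assumptions for illustration):

```python
import io
import pandas as pd

# Hypothetical file contents: two lines of description, then the column names.
raw = (
    "Measured on sample A\n"
    "Units: K and Ohm\n"
    "Temperature,Resistance\n"
    "2.0,0.10\n"
    "4.0,0.25\n"
)

# header=2 means the third line (0-based index 2) holds the column names;
# the two description lines before it are skipped.
data = pd.read_csv(io.StringIO(raw), header=2)
print(list(data.columns))
print(len(data))
```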

To view the data or information about it, use some of the following commands:

data                    #show data
data.head()        #show the first few rows
data.tail()            #show the last few rows
data.keys()          #show the column names
data.info()           #summarize columns
data.dtypes         #show the data type of each column (an attribute, so no parentheses)
data.describe()    #quickly gives mean, std, min, max values of the numerical columns

If you want to see all the data in a table, instead of only a few rows:

pd.options.display.max_rows = 1500 # if you have 1500 rows
data

You will then see all the data.

If you want to delete a redundant column:

data.drop('column name/key', axis=1, inplace=True)

Now, we want to plot the numerical data:

fig=figure()
# fig.clf()      #used to clear a previous plot in the same figure
ax = fig.add_subplot(1,1,1)
plot_No1 = ax.plot(data.Temperature[0:186], data.Resistance[0:186], linewidth = 2., label=r"0T")
plot_No2 = ax.plot(data.Temperature[187:231],data.Resistance[187:231],  linewidth = 2.,label=r"1T")
ax.legend(loc = 2)    # show legend, with loc(ation) at 2 (upper left)
xlabel(r'Temperature (K)', fontsize = 20)
ylabel(r'Resistance ($\Omega$)', fontsize = 20)
ax.tick_params(axis='both', which='major', labelsize=15)
savefig('fig', transparent = True)
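
The commands above rely on the names that --pylab imports. As a hedged alternative, here is a sketch of the same plot with explicit imports, so it also runs as a plain script. The DataFrame contents are made-up stand-in values, and the "Agg" backend and the file name fig.png are choices made for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data; replace with the DataFrame read from your file.
data = pd.DataFrame({
    "Temperature": [2.0, 4.0, 6.0, 8.0, 2.0, 4.0, 6.0, 8.0],
    "Resistance": [0.0, 0.1, 0.5, 0.9, 0.2, 0.3, 0.6, 1.0],
})

fig, ax = plt.subplots()
# iloc selects rows by position; the end of each slice is exclusive.
first = data.iloc[0:4]
second = data.iloc[4:8]
ax.plot(first["Temperature"], first["Resistance"], linewidth=2.0, label="0T")
ax.plot(second["Temperature"], second["Resistance"], linewidth=2.0, label="1T")
ax.legend(loc="upper left")
ax.set_xlabel("Temperature (K)", fontsize=20)
ax.set_ylabel(r"Resistance ($\Omega$)", fontsize=20)
ax.tick_params(axis="both", which="major", labelsize=15)
fig.savefig("fig.png", transparent=True)
```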


These commands can help you to quickly visualize your data.
I will explore many more advanced features in the future.