Data organisation in experiments with multiple variables

Here we discuss using the GenEx Data Manager

Moderator: MultiD Support


Post by Mathias » Tue Nov 17, 2009 9:36 pm

I have been trying to use the GenEx software for an analysis with multiple variables, but I am running into a lot of difficulties, especially with the proper way to input (organise) the data so that a full analysis of all samples can be performed.

I went through the tutorials on www.multid.se and everything looks quite easy when there are data for only two experimental groups (control and treatment). However, things get complicated when I try to analyse more sophisticated experiments.

Let's say that my experiment contains data for 10 genes in 2 tissues from 5 individuals in 3 experimental groups (control, treatment1, treatment2) at 3 different timepoints (0, 24, 48 h after treatment). That is a lot of variables. Is it possible to perform a combined analysis of all of them, or should the data be analysed separately (independently)? I would appreciate any tips on how to organise the data in this case.

Thanks in advance.
Mathias
 

Re: Data organisation in experiments with multiple variables

Post by Anders Bergkvist » Thu Nov 19, 2009 2:27 pm

Hi Mathias,

We distinguish between exploratory and confirmatory statistical studies. Central to any statistical study is the hypothesis. An exploratory study is performed by browsing through the data with the aim to propose a hypothesis. A confirmatory study starts with a well-defined hypothesis and aims to confirm this hypothesis based on strict pre-defined criteria.

Your data set has many dimensions, and that is an inherent complexity. I assume you don't have a specific hypothesis you want to confirm, but rather intend to use this data set for an exploratory study. There are many ways to arrive at proposed hypotheses. I would suggest you use several different visualization, clustering and partitioning techniques to get a good overview of your data. In the end it is up to you as the analyst to propose hypotheses based on your impression of the data.

Yes, it is possible to analyze all the variables at the same time. See Ståhlberg et al., BMC Genomics 9:170 (2008) for reference. This paper illustrates how you can incorporate several variables in your data matrix by lamination or catenation. It also shows how you can use both colors and symbols in plots to efficiently visualize many aspects of the data simultaneously. And yes, you can also analyze the data separately, and I would recommend that you do. This may yield additional hypotheses, allowing you to extract more information from the data. For both the combined data set and separate subsets of it, I would recommend you start by looking into hierarchical clustering, principal component analysis (PCA) and self-organizing maps (SOM).
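As a rough illustration of what "all variables in one data matrix" can look like for the design in the question, here is a minimal Python sketch. It is not GenEx's own file format and not the procedure from the paper; it only assumes one row per sample, one column per gene, and extra classification columns (group, time, tissue, individual) that can later drive colors and symbols or be used to pull out subsets. All labels are hypothetical, the numbers are random placeholders rather than measurements, and it assumes every combination of factors was actually measured.

```python
import itertools
import random

# Design factors taken from the question; the labels themselves are hypothetical.
GENES = [f"gene{i}" for i in range(1, 11)]
TISSUES = ["tissueA", "tissueB"]
INDIVIDUALS = [1, 2, 3, 4, 5]  # numbered within each group/timepoint
GROUPS = ["control", "treatment1", "treatment2"]
TIMEPOINTS_H = [0, 24, 48]


def build_table(seed=0):
    """Return one row per sample: classification columns plus one value per gene."""
    rng = random.Random(seed)
    rows = []
    for group, time_h, tissue, individual in itertools.product(
        GROUPS, TIMEPOINTS_H, TISSUES, INDIVIDUALS
    ):
        row = {
            "group": group,
            "time_h": time_h,
            "tissue": tissue,
            "individual": individual,
        }
        for gene in GENES:
            # Placeholder numbers only; real measured values would go here.
            row[gene] = round(rng.uniform(18.0, 32.0), 2)
        rows.append(row)
    return rows


def subset(rows, **criteria):
    """Select the samples whose classification columns match all criteria."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


table = build_table()
print(len(table), "samples x", len(GENES), "genes")

# A separate analysis would start from a subset such as one tissue at one timepoint.
tissue_a_24h = subset(table, tissue="tissueA", time_h=24)
print(len(tissue_a_24h), "samples in the tissueA / 24 h subset")
```

With the full table in this shape, the combined analysis uses all rows, while each separate analysis is just a call to `subset` with the classification values of interest.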

Best wishes,
Anders

