Questions tagged [machine-learning]


From The Discipline of Machine Learning by Tom Mitchell:

The field of Machine Learning seeks to answer the question "How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?" This question covers a broad range of learning tasks, such as how to design autonomous mobile robots that learn to navigate from their own experience, how to data mine historical medical records to learn which future patients will respond best to which treatments, and how to build search engines that automatically customize to their user's interests. To be more precise, we say that a machine learns with respect to a particular task T, performance metric P, and type of experience E, if the system reliably improves its performance P at task T, following experience E. Depending on how we specify T, P, and E, the learning task might also be called by names such as data mining, autonomous discovery, database updating, programming by example, etc.

3322 questions
0
votes
0 answers

Exercise $3.19$ of Sutton's Reinforcement Learning

I'm trying to solve the first part of exercise $3.19$ of Sutton's Reinforcement Learning (Chapter $3.5$, page $62$, second edition). The question reads: The value of an action, $q_{\pi}(s, a)$, depends on the expected next reward and the expected…
0
votes
0 answers

Support vector machine - SVM - calculation on paper

I have a problem with SVM. I tried to solve this but it doesn't work. I followed a guide on this link, but I'm lost. Okay, the problem: I have the dataset $D = \{((0,0),-1),((2,4),-1),((4,2),-1),((6,4),+1),((6,8),+1),((8,8),+1)\}$. I need to find a hard…
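A minimal sketch for checking a hand calculation on this dataset, assuming the truncated question is about a hard-margin separator. The candidate `w = (0.5, 0.5)`, `b = -4` is a guess made by inspection (an assumption, not taken from the question). The check relies on a general fact: no separating hyperplane can have a geometric margin larger than half the distance between the closest pair of opposite-class points.

```python
import math

# Dataset D from the question: ((x1, x2), label).
data = [((0, 0), -1), ((2, 4), -1), ((4, 2), -1),
        ((6, 4), +1), ((6, 8), +1), ((8, 8), +1)]

# Hypothetical candidate found by inspection; not given in the question.
w, b = (0.5, 0.5), -4.0

# Functional margin y * (<w, x> + b); hard-margin feasibility needs all >= 1.
functional_margins = [y * (w[0] * x[0] + w[1] * x[1] + b) for x, y in data]
feasible = all(m >= 1 for m in functional_margins)

# Geometric margin: distance from the closest point to the hyperplane.
geometric_margin = min(functional_margins) / math.hypot(*w)

# Upper bound on any margin: half the smallest distance between classes.
closest_gap = min(
    math.dist(p, q)
    for p, yp in data for q, yq in data
    if yp == -1 and yq == +1
)
margin_upper_bound = closest_gap / 2
```

If `feasible` is true and `geometric_margin` matches `margin_upper_bound`, the candidate attains the largest possible margin for this data.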
0
votes
0 answers

Definition of negatives in NT-Xent loss

I'm trying to understand a few details about the NT-Xent loss defined in the SimCLR paper (link). The loss is defined as $$\ell_{i,j} = -\log\frac{\exp(\mathrm{sim}(z_i,z_j)/\tau)}{\sum_{k=1}^{2N}\mathbb{1}_{[k\neq i]} \exp(\mathrm{sim}(z_i,z_k)/\tau)}$$ where $z_i$ and…
James Arten
  • 1,953
  • 1
  • 8
  • 20
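The NT-Xent formula in the excerpt above can be sketched in plain Python. This assumes $\mathrm{sim}$ is cosine similarity (as in SimCLR); the toy embeddings and the value `tau=0.5` are hypothetical. The point of the sketch is that the indicator $\mathbb{1}_{[k\neq i]}$ drops only the anchor $z_i$, so the positive $z_j$ still appears in the denominator.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nt_xent(z, i, j, tau=0.5):
    """Loss l_{i,j} for the positive pair (i, j) among the 2N embeddings in z."""
    numerator = math.exp(cosine_similarity(z[i], z[j]) / tau)
    # Sum over every k != i: the positive j plus the 2N - 2 negatives.
    denominator = sum(
        math.exp(cosine_similarity(z[i], z[k]) / tau)
        for k in range(len(z))
        if k != i
    )
    return -math.log(numerator / denominator)

# Hypothetical toy batch: N = 2, so 2N = 4 embeddings.
z = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (-1.0, 0.2)]
loss_similar_pair = nt_xent(z, 0, 1)     # z_0 and z_1 point the same way
loss_dissimilar_pair = nt_xent(z, 0, 2)  # z_0 and z_2 are orthogonal
```

The loss is smaller when the pair treated as positive is more similar than the other candidates in the denominator.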
0
votes
0 answers

What is the mathematical explanation behind a larger RBF kernel parameter resulting in a more linear decision boundary in SVM?

I know that as the kernel parameter for the RBF kernel increases, the Gaussian function becomes less peaked and broader. The reach of each point becomes larger, meaning that more distant data points have more weight. Intuitively, it makes sense that a…
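A small numeric sketch of the broadening described above, assuming the "kernel parameter" is the bandwidth $\sigma$ in $k(x,y)=\exp(-\|x-y\|^2/(2\sigma^2))$. Libraries that parameterize by $\gamma = 1/(2\sigma^2)$ behave the opposite way as $\gamma$ grows. For large $\sigma$ the kernel is close to its first-order expansion $1 - \|x-y\|^2/(2\sigma^2)$.

```python
import math

def rbf_kernel(x, y, sigma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Two hypothetical points at Euclidean distance 5.
x, y = (0.0, 0.0), (3.0, 4.0)

# Kernel value between the same two points for increasing bandwidths.
values = [rbf_kernel(x, y, sigma) for sigma in (1.0, 5.0, 50.0)]

# First-order Taylor approximation at the largest bandwidth.
sigma = 50.0
taylor = 1.0 - 25.0 / (2.0 * sigma ** 2)
approximation_error = abs(values[-1] - taylor)
```

With a small bandwidth the distant point contributes almost nothing; with a large one its kernel value approaches 1.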
0
votes
0 answers

Why does regularization have an effect in linear classifiers?

I'm struggling to understand how regularisation, for example using the l1 or l2 norm, has any effect on linear classification problems. If we have a simple binary classification task where we are trying to find a weight vector $w$ to classify a…
Tommy
  • 13
  • 4
0
votes
0 answers

Trying to show that the weighted error equals 1/2 using AdaBoost

Consider an ensemble classifier constructed by T rounds of AdaBoost on N training examples. \begin{align} H(\mathbf{x}) = \sum_{t= 1}^{T} \alpha_{t} h_{t}(\mathbf{x}) \end{align} The next classifier, $h_{T+1}$, is added to the ensemble by minimizing…
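A numeric sanity check, assuming the truncated question refers to the standard AdaBoost property that, after the weight update, the classifier just added has weighted error exactly 1/2 under the new weights. The labels, predictions, and uniform starting weights below are hypothetical.

```python
import math

def weighted_error(weights, labels, predictions):
    return sum(w for w, y, h in zip(weights, labels, predictions) if y != h)

def adaboost_reweight(weights, labels, predictions):
    """One AdaBoost update: returns the normalized new weights and alpha."""
    eps = weighted_error(weights, labels, predictions)
    alpha = 0.5 * math.log((1.0 - eps) / eps)
    unnormalized = [
        w * math.exp(-alpha * y * h)
        for w, y, h in zip(weights, labels, predictions)
    ]
    total = sum(unnormalized)
    return [w / total for w in unnormalized], alpha

labels = [1, 1, -1, -1, 1]
predictions = [1, -1, -1, -1, 1]  # the weak classifier errs on one example
weights = [0.2] * 5               # uniform initial distribution

error_before = weighted_error(weights, labels, predictions)
new_weights, alpha = adaboost_reweight(weights, labels, predictions)
error_after = weighted_error(new_weights, labels, predictions)
```

The misclassified mass becomes $\epsilon\sqrt{(1-\epsilon)/\epsilon}$ and the correctly classified mass becomes $(1-\epsilon)\sqrt{\epsilon/(1-\epsilon)}$; both equal $\sqrt{\epsilon(1-\epsilon)}$, so each is half of the total after normalization.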
0
votes
0 answers

How should I normalize users' movie ratings (y)?

I have two datasets. The first contains the ratings that users gave to a movie (y). The second indicates whether each user has rated the movie (r): r = 1 if the user has rated the movie, and r = 0 otherwise. I don't know how I should…
bento
  • 1
0
votes
0 answers

What is the derivative of log(f(x+y)) with respect to x?

I have the function alpha + beta log(gamma((alpha+beta))), and I want to calculate its derivative with respect to alpha. I need it as part of the derivative of the inverted Beta-Liouville function with respect to each of its parameters. …
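For the question in the title, the chain rule gives the general form below, assuming $f$ is differentiable and positive at $x+y$. The second line is the special case $f=\Gamma$, where $\psi$ denotes the digamma function; it covers only the $\log\Gamma(\alpha+\beta)$ factor, since the exact grouping of the asker's full expression is not clear from the excerpt.

$$\frac{\partial}{\partial x}\log f(x+y) = \frac{f'(x+y)}{f(x+y)}$$

$$\frac{\partial}{\partial \alpha}\log\Gamma(\alpha+\beta) = \frac{\Gamma'(\alpha+\beta)}{\Gamma(\alpha+\beta)} = \psi(\alpha+\beta)$$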
0
votes
1 answer

Definition of "Bias" in Machine learning models

In the estimation of a parameter, say the average of a population, the definition of "bias" is very clear. It is the difference between the average estimator value (averaged over random samples) and the true value of the parameter. In machine learning…
Daniel
  • 75
  • 7
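The statistical definition of bias in the excerpt above can be made concrete with an exact calculation. This sketch enumerates every size-2 sample drawn with replacement from a tiny hypothetical population and compares two variance estimators; the population values and sample size are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

population = [Fraction(v) for v in (1, 2, 6)]  # hypothetical population
n = 2                                          # sample size

def mean(values):
    return sum(values) / len(values)

def variance_divide_by_n(sample):
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def variance_divide_by_n_minus_1(sample):
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)

def expected_value(estimator):
    # Average the estimator over all equally likely samples of size n.
    samples = list(product(population, repeat=n))
    return sum(estimator(s) for s in samples) / len(samples)

true_variance = variance_divide_by_n(population)

# Bias = (average estimator value over samples) - (true parameter value).
bias_divide_by_n = expected_value(variance_divide_by_n) - true_variance
bias_divide_by_n_minus_1 = expected_value(variance_divide_by_n_minus_1) - true_variance
```

Dividing by `n` underestimates the variance on average, while dividing by `n - 1` has zero bias for this sampling scheme.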
0
votes
0 answers

Whitening Transformation in LDA

I have a question about the process for finding the optimal subspaces for LDA. A detailed description can be found on p. 114 of 'The Elements of Statistical Learning'. I want to know what role the 'whitening transformation' plays in this process. I think…
0
votes
1 answer

Hard SVM (distance between point and hyperplane)

While studying the Hard-SVM topic in the Shalev-Shwartz book, I came across the following proof for the distance between a point and a hyperplane $$\min\{\|\pmb x-\pmb v\|: \langle\pmb w,\pmb v\rangle + b = 0\}\\ \text{Taking }\ \pmb v = \pmb x\ - (\langle \pmb…
ASR
  • 1
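A numeric sketch of the point-to-hyperplane distance discussed above, written for a general (not necessarily unit-norm) $\pmb w$. It uses the standard orthogonal projection $\pmb v = \pmb x - \frac{\langle \pmb w, \pmb x\rangle + b}{\|\pmb w\|^2}\pmb w$, which is an assumption about where the truncated proof is heading; the numbers are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_onto_hyperplane(x, w, b):
    """Closest point to x on the hyperplane {v : <w, v> + b = 0}."""
    scale = (dot(w, x) + b) / dot(w, w)
    return tuple(xi - scale * wi for xi, wi in zip(x, w))

def distance_to_hyperplane(x, w, b):
    return abs(dot(w, x) + b) / math.sqrt(dot(w, w))

w, b = (3.0, 4.0), -5.0   # hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0
x = (4.0, 7.0)            # hypothetical query point

v = project_onto_hyperplane(x, w, b)
on_hyperplane_residual = dot(w, v) + b          # should be ~0
distance_via_projection = math.dist(x, v)       # ||x - v||
distance_via_formula = distance_to_hyperplane(x, w, b)
```

When $\|\pmb w\| = 1$ the formula reduces to $|\langle \pmb w, \pmb x\rangle + b|$.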
0
votes
1 answer

How to interpret this objective function

I'm terrible at interpreting math formulas and would like to ask for some help. I am going through the Scikit-learn library for machine learning in Python and stumbled upon this formula: This is the objective function for Lasso linear…
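The formula itself is missing from the excerpt. Assuming it is the objective given in scikit-learn's `Lasso` documentation, $\frac{1}{2 n_{\text{samples}}}\|y - Xw\|_2^2 + \alpha\|w\|_1$, each term can be read off from a direct plain-Python evaluation. The data below are hypothetical and no scikit-learn call is involved.

```python
def lasso_objective(X, y, w, alpha):
    n_samples = len(y)
    # Residual for each sample: observed target minus linear prediction.
    residuals = [
        yi - sum(xij * wj for xij, wj in zip(xi, w))
        for xi, yi in zip(X, y)
    ]
    # Data-fit term: squared error scaled by 1 / (2 * n_samples).
    data_fit = sum(r * r for r in residuals) / (2 * n_samples)
    # Penalty term: alpha times the sum of absolute coefficient values.
    l1_penalty = alpha * sum(abs(wj) for wj in w)
    return data_fit + l1_penalty

# Hypothetical data that w = (1, 2) fits exactly.
X = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
y = [1.0, 2.0, 3.0]

value_exact_fit = lasso_objective(X, y, (1.0, 2.0), alpha=0.1)
value_all_zero = lasso_objective(X, y, (0.0, 0.0), alpha=0.1)
```

The first term rewards fitting the data and the second charges for large coefficients, so a larger `alpha` pushes more coefficients toward zero.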
0
votes
2 answers

Log-likelihood, Machine learning

I'm referring to this practice problem: https://davidrosenberg.github.io/mlcourse/ConceptChecks/10-Lab-Check_sol.pdf In particular, on page $2$, in the second equation of the solution, where it uses the log to find the likelihood function, I don't…
lly ke
  • 15
0
votes
1 answer

How to minimize objective function?

Fit a line with zero $y$-intercept ($\hat{y} = ax$) on the curve $y=x^2+x$. Instead of minimizing the sum of squares of the errors, minimize the following objective function: $$\sum_i \left[ \left(\frac{y^i}{\hat{y}^i}\right)^2 +…
Adam T
  • 3
0
votes
1 answer

lasso approach to compute lasso estimate

Consider the following LASSO problem $\min_{\beta} \sum\limits_{i=1}^n(y_{i}-\sum\limits_{j=1}^p x_{ij}\beta_{j})^2$, subject to $\sum\limits_{j=1}^p|\beta_{j}|\leq t$, where $t \geq 0$ is a constant. (a) If $t = 0$, compute…